Logging On
Wharton’s Host Systems 


The Wharton School has several “host” computer systems available 
for remote access, including Wharton’s Unix systems (which include 
hosts Futures, Equity, and Assets), and Wharton’s Academic VMS 
System (host Wilma). 


Before logging on to a Wharton host system, you need an account 
on that system. Accounts for most Wharton systems are available 
from Wharton’s Accounts Coordinator in 212 Vance Hall. For more 
information, contact the Accounts Coordinator at 898-0750. 


If you plan to connect to a host system using PennNet or PennNet's 
dial-in lines, you also need a PennNet network ID and password. 
With a valid PennCard, you can get a PennNet ID and password 
from several locations including Wharton’s Accounts Coordinator 
in 212 VH. For questions or more information, contact Wharton’s 
computer consultants at 898-8600 or the PennNet Services Center 
at 898-8171.


Once you have an account on the host system, you need to (1) connect
to the system and then (2) log on, as described in the following
sections.


Access to Wharton's host systems is available over Wharton's 
Ethernet network or PennNet through several methods: 


> You can access most Wharton host systems from the
microcomputers in Wharton's DOS/Windows computer labs,
including the MBA Lab in 210-11 VH, the DOS/Windows
Lab in 114 SH-DH, and the Training Lab in 116 SH-DH
(see "Connecting from the DOS/Windows Computer Labs").


» If you have a modem, communications software, and a PennNet 
network ID and password, you can also access Wharton's host 
systems by dialing in to PennNet over telephone lines from 
off-campus or from the Sun Lounge "Laptop Computer Bar" 
in Vance Hall (see "Connecting by Dialing In"). 


» Many offices can access host systems through a connection to 
Wharton’s network (see “Connecting from Wharton Offices"). 


Once you connect to the system, you need to log on, typically by
entering your account name (or "username") and then your password
(see "Logging On"). On some systems you are also required to
change your password when you log on for the first time (see
"Changing Your Password").


Connecting from the DOS/Windows Computer Labs


The DOS/Windows computer labs in Steinberg Hall-Dietrich Hall 
and Vance Hall offer high-speed Ethernet connections to Wharton's 
network. To connect to a Wharton host system from a lab computer, 
do the following: 


Step 1 Start the lab computer and go into Windows as follows: 


> If the lab computer is off, turn on the computer and, if necessary, 
the monitor. (The monitor may take several seconds to warm up.) 


> If the computer is already on, simultaneously press the Ctrl, Alt, 
and Del keys to reboot the computer. 


> When the screen prompts you to "Enter your user ID:",
enter your last name. (You don't need to enter a password.)


It is important to reboot the computer and enter your own "user ID"
each time you use a lab computer. If you need to reboot the
computer again while you're using it, enter the same user ID.
This prevents the computer from performing a routine purge of the
hard disk and erasing your files.


Once you enter your user ID, you will see the lab's Main Menu.

> From the Main Menu, type win to start Microsoft Windows.


Step 2 Open a telnet session to the host system as follows:


> Several commonly-used Wharton host systems—such as 
Unix systems Equity, Futures, and Assets, and VMS system 
Wilma—have their own icon in the “Network Host Access" group. 





  - If the host you want is listed, simply double-click its icon.


> If you want to connect to another host system, or a system outside
Wharton, do the following:

  - Double-click the "Telnet" icon, which displays an "Open Session"
    dialog box.

  - Type the name of the system in the Hostname box.

    For a system outside Wharton, enter the full Internet name of the
    host (for example, archie.rutgers.edu).

    You can leave the Session Name blank.


If you've entered a valid host name, you'll see a "Connecting to host"
message as you connect to the system. Depending on the system, you
may see a brief message identifying the system and then be prompted
for your user name (see "Logging On"). When you are finished, be
sure to log off the system (see "Logging Off").


Connecting by Dialing In


If you have a modem and communications software (such as
MS-Kermit, ProComm, or MicroPhone II), you can connect to
Wharton's host systems over telephone lines by dialing in to PennNet,
the University's communications network. To use PennNet, you also
need a PennNet network ID and password (see the introduction).


To dial in through PennNet:

Step 1 Set up your modem and configure your communications software.


> The communications settings for PennNet are 8 data bits and
no parity. PennNet supports transfer rates of 1200, 2400, 9600,
and 14,400 bps.


> Select VT100 or VT102 terminal emulation.


Refer to the instructions for your modem and communications 
software for more information. 


Step 2 Start your communications software and dial PennNet at
898-0834.


> From an on-campus phone—including the phone lines at Wharton’s 
Laptop Computer Bar—simply dial 8-0834. 


Most communications packages offer automatic dialing—you simply 
need to tell the program the phone number. If your software doesn’t 
offer automatic dialing, you need to enter terminal emulation, type 
in a “dialing prefix” code and then the phone number. On most 
modems, the dialing prefix for a touch-tone line is ATDT, and for a 
pulse or rotary line is ATDP. 


For example, to dial PennNet from an off-campus phone using a 
touch-tone telephone line, enter ATDT8980834. 
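
The dialing prefixes above can be sketched as a small shell function. This is only an illustration of how the dial string is assembled; dial_string is a made-up helper name, not a command provided by any communications package, and your software may build the string for you.

```sh
#!/bin/sh
# dial_string is a hypothetical helper: it joins the dialing prefix for
# the line type (ATDT for touch-tone, ATDP for pulse or rotary) with the
# phone number, as described in the text.
dial_string() {
    line_type=$1    # "tone" or "pulse"
    number=$2       # digits only, for example 8980834
    case "$line_type" in
        tone)  prefix=ATDT ;;
        pulse) prefix=ATDP ;;
        *)     echo "unknown line type: $line_type" >&2; return 1 ;;
    esac
    printf '%s%s\n' "$prefix" "$number"
}

dial_string tone 8980834     # prints ATDT8980834
dial_string pulse 8980834    # prints ATDP8980834
```

You would type the resulting string in your software's terminal-emulation mode.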


Step 3 Once your modem connects to PennNet, press the Enter key
a few times.

PennNet responds with a connect message that begins:

Annex Command Line Interpreter

Step 4 If necessary, press the Enter key once more until PennNet displays
several lines of information and then asks for your network ID:

Network ID:


Step 5 Enter your PennNet network ID and, when prompted, your network
password.

If your network ID and password are correct, PennNet displays the
annex: prompt.


If you make a mistake entering your network ID or password, 
PennNet lets you try again. After three tries, however, PennNet logs 
you out and you must start over again with step 2. 


From the annex: prompt, enter telnet hostname.wharton 
where hostname is the system you want to connect to. 


For example, to connect to Wharton's Unix system Equity enter 
telnet equity.wharton. To connect to Wharton’s VMS 
system Wilma enter telnet wilma.wharton. 


From the annex: prompt you can also connect to non-Wharton
systems, such as Penn's library (telnet library), PennInfo
(telnet penninfo), or any Internet host system.


Once you're connected to the host system you may see a brief 
message identifying the system. You are now ready to log on (see 
"Logging On"). When you are finished, be sure to log off the system 
(see "Logging Off"). 


Connecting from Wharton Offices 


Many offices within the School have connections to Wharton's 
network, either through an asynchronous connection to a terminal 
server or a direct Ethernet connection. 


If you're connected asynchronously through a terminal server, you'll
typically see one of the following prompts: "DIAL:", "annex:", or
"Local>". See the steps in "From an Asynchronous Connection" to
connect to a host system.


If you have a direct Ethernet connection, you won't see a 
terminal-server prompt, and can connect directly from your computer 
to the host system using a TCP/IP connection. See “From an Ethernet 
Connection Using TCP/IP,” below for more information. 


From an Ethernet Connection Using TCP/IP 


If you have a direct Ethernet connection to Wharton's network 
you can connect to a host system at Wharton (or anywhere on the 
Internet) using a communications protocol known as “TCP/IP.” 


How you connect to a host system depends on the type of TCP/IP 
software you're using. 


> In most cases you can simply enter telnet hostname, where
hostname is the name of the host system.


Depending on your TCP/IP software, you may use the command
tnvt hostname or cn hostname. If you're using Windows or a
Macintosh, you may have a version of TCP/IP that allows you to select
a host name from a menu or by entering it into a dialog box.

Contact your departmental computing support person if you're not 
certain how your TCP/IP software works. 


To connect to a Wharton host from within the Wharton School, you 
only need to specify the system's name. For example, to connect to 
system Futures enter telnet futures. 


To connect from a remote Internet site, you need to give the full 
Internet name of the system (referred to as a “fully-qualified domain 
name"). For example, to connect to Futures from another Internet 
site, enter telnet futures.wharton.upenn.edu. 
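
The difference between the short name and the fully-qualified name can be sketched in shell. qualify_host is a hypothetical helper written for this example; the only thing taken from the text is the wharton.upenn.edu suffix, and the rule "a name that already contains a period is left alone" is an assumption.

```sh
#!/bin/sh
# qualify_host is a hypothetical helper: it expands a short Wharton host
# name to the full Internet name needed from a remote site.
qualify_host() {
    case "$1" in
        *.*) printf '%s\n' "$1" ;;                     # assumed already qualified
        *)   printf '%s.wharton.upenn.edu\n' "$1" ;;   # short Wharton name
    esac
}

qualify_host futures               # prints futures.wharton.upenn.edu
qualify_host archie.rutgers.edu    # printed unchanged

# From a remote Internet site you could then enter, for example:
#   telnet "$(qualify_host futures)"
```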

From an Asynchronous Connection 


To connect to a Wharton host system from a Wharton office with an 
asynchronous connection, do the following: 


Step 1 Start your communications software.

Refer to the manual for your communications software for more
information.


Step 2 Press the Enter key several times to get to the connection prompt.
If requested, type in your username and press the Enter key.

Depending on how your office is connected, the screen should show
one of the following prompts:

DIAL: (Go to Step 3.)

Or

annex: (Go to Step 4.)

Or

Local> (Go to Step 5.)


Step 3 If the screen says DIAL: enter telnet.

The screen says:

RINGING
ANSWERED

> Press the Enter key a few times to display the Network ID:
prompt.

PennNet responds with a connect message that begins Annex
Command Line Interpreter, and then asks for your
network ID:

Network ID:


> Enter your PennNet network ID and, when prompted, your network
password.

If your network ID and password are correct, PennNet displays the
annex: prompt; continue with the next step.


Step 4 If the screen says annex: enter telnet hostname.wharton
where hostname is the system you want to connect to.

For example, to connect to Wharton's Unix system Equity, enter
telnet equity.wharton. To connect to Wharton's academic
VMS system, enter telnet wilma.wharton.


Step 5 If the screen says Local>, enter c hostname where hostname is
the system you want to connect to.

From the Local> prompt, you can only connect to Wharton's
VMS-based systems, such as host Wilma. To connect to Wilma, for
example, enter c wilma.


After completing step 3, 4, or 5 you should be connected to the host
system. Depending on the system, you may see a brief message
identifying the system. You are now ready to log on.


Logging On


Once you connect to a system you'll typically see a brief connection
message that identifies the system, and will then be asked to identify
your account or username.

A Unix system displays the login prompt:

login:

On a VMS-based system you're asked for your username:

Username:


Step 1 Type in the user name you were assigned when you opened your
account, and then press the Enter key.

On a Unix system, user names and passwords are case sensitive, and
must be entered with the correct capitalization.

For most accounts, you need to enter a password:

Password:


Step 2 Type in your password and then press the Enter key.

The password does not appear on the screen when you type it.

If you make an error entering your username or your password, you'll
see an error message such as "User authorization failure"
or "Login incorrect." If this happens, press the Enter key if
necessary to return to the Username: or login: prompt, and
repeat steps 1 and 2.

If you enter your username and password correctly, you will usually 
see a log on message. 


On some systems, you'll be asked to select a terminal type: 
TERM = (vt100) 


Unless you're using a special type of terminal emulation, you can
either enter vt100 or simply press the Enter key.


Once you're logged on, you'll see the system prompt, typically a
percent sign (%) for a Unix system or a dollar sign ($) for a VMS
system. On some systems, the system prompt may include the name
of the system or the current working directory.


On some Unix systems you'll see an opening menu listing many
commonly-used commands. From the Unix system prompt (%), you
can enter menu to redisplay this list of commands.


On certain systems (like Wharton's VMS systems), you are required 
to change your password when you log on for the first time. On other 
systems you may want to select a new password (particularly if one 
was automatically assigned to you when the account was created). 
See the next section for more information on changing passwords. 


When you are finished using the system, be sure to log off before you 
leave (see "Logging Off," below). 


Changing Your Password

When you log on to Wharton’s academic VMS systems for the first 
time, you must change the password you were assigned. Follow the 
instructions below beginning at Step 2. 


Whatever system you’re using, you may want to select a new 
password when you first use the system. 


The exact steps for changing your password depend upon the system 
you're using. On most Wharton systems, you can change your 
password as follows: 


Step 1 At the system prompt, enter the following:

> On a Unix system, enter passwd at the Unix system prompt (%)
> On a VMS system, enter set password at the VMS prompt ($)

You'll then be prompted for your old password.


Step 2 When asked for your old password, type in your current password
and then press the Enter key.

You'll then be asked for the new password you'd like to use.


Step 3 Type in a new password.


Your password should be at least 6 characters long. On Unix systems 
passwords are case sensitive—lower-case and upper-case letters are 
different. To avoid problems, it's a good idea to use only lower-case 
letters, numbers, and the underscore, period, or exclamation mark 
characters. Do not use the pound sign or the “at” sign in your 
password. 


The password does not appear on the screen when you type it. 


To make sure you typed your new password correctly, you are asked 
to enter it again. 


Step 4 Type in your new password again.


If you typed the same password both times, the system returns to the 
system prompt. From now on, use this password to log on to this 
system. 


If you make a mistake typing your password, you'll receive an error 
message, and the system does not change your password. If this 
happens, repeat steps 2 through 4. 
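
The password recommendations in Step 3 can be sketched as a quick check you might run before choosing a password. check_password is a hypothetical helper written for this example; it only applies the advice given above (at least 6 characters; lower-case letters, numbers, underscore, period, or exclamation mark) and does not talk to passwd or set password.

```sh
#!/bin/sh
# Use the C locale so that a-z means lower-case ASCII letters only.
LC_ALL=C
export LC_ALL

# check_password is a hypothetical helper that applies the Step 3 advice.
check_password() {
    pw=$1
    if [ "${#pw}" -lt 6 ]; then
        echo "too short"
        return 1
    fi
    case "$pw" in
        # Any character outside the recommended set, which also rules
        # out the pound sign and the "at" sign.
        *[!a-z0-9_.!]*) echo "unsupported character"; return 1 ;;
    esac
    echo "ok"
}

check_password wharton_94            # prints ok
check_password 'Pass#1' || true      # prints unsupported character
```

The system itself may accept passwords this check rejects; the sketch is deliberately conservative.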


Logging Off


When you are finished using any networked host system, make sure
you log off the system.


Step 1 At the system prompt, enter the following:

> On a Unix system, enter logout at the Unix system prompt (%)
> On a VMS system, enter logoff at the VMS prompt ($)


Step 2 If you connected to the system asynchronously through a terminal
server, log off the terminal server:

> If your screen says annex: type hangup to log off the server.
> If your screen says Local> type logout to log off the server.


Step 3 Exit your communications software.

How you do this depends on the software you're using.


> If you're in Wharton's DOS/Windows labs and you selected a
host from a Program Manager icon, when you log off you will
automatically exit the Telnet program and return to Windows.


> If you're in Wharton's DOS/Windows labs and you selected the
Telnet icon and entered a host name in the "Open Session" dialog
box, when you log off the host system, the "Open Session" dialog
box reappears. To return to the Windows Program Manager, click
the "Cancel" button and then select Exit from the File menu.


> If you're using MS-Kermit, press Alt x to exit Kermit and
return to DOS.


> If you're using the DOS version of ProComm, press Alt x and then
type a Y when asked "EXIT TO DOS?"


> If you're using a different communications package, consult
your manual.


Step 4 If you are finished using a computer in Wharton's labs, exit Windows
and turn off the computer.


Files and Directories:
Wharton’s Unix Systems 


File and Directory Names 


Unix file names are limited to 14 characters, and certain 
nonalphabetic characters cannot be used in the file name. To avoid 
problems, it's a good idea to use only letters, numbers, the underscore 
character, and the period when naming your files. 


Unlike DOS, VMS, and most other operating systems, Unix is case 
sensitive—lower-case and upper-case letters are different. This means 
that myfile, MyFile, and MYFILE are three different file names 
and must be typed with the correct capitalization. 

File names that begin with a period are "hidden" files and are not
normally displayed. For example, most users have the file .login
in their home directory, but this file is not displayed with the ls
command. Do not remove or modify hidden files unless you are
familiar with their purpose.


Specifying Files and Directories with Path Names 


When you specify a file by its name only, it refers to a file in the
current directory. To refer to a file in a different directory, you need to
include the path name for the file.


An absolute path to a file or directory lists all the directories from 
the top-level "root" directory down to the file or directory, with each 
directory name separated by a forward slash (/) character. 


For example, /users/welles/temp.txt refers to the file
temp.txt in the directory welles, which is in the directory
users.


You can also refer to a file or directory by using a relative path,
which locates the file or directory relative to your current working
directory, or by using several abbreviations for a full path:


.    Your current directory.
..   Up one level from your current directory.
~    Your home or login directory.
/    The top-level or "root" directory.


© Copyright 1994 The Wharton School of the University of Pennsylvania



Moving Around the File System 


You are always "in" a specific directory, which is used as the default 
directory for most commands. 


pwd "Print working directory;" shows your current directory.


cd dirname "Change directory;" moves you to a new default or "working"
directory.
Examples: 
cd /usr/local/bin 
Moves you to the directory /usr/local/bin (if it exists). 


cd ~
Returns you to your home directory.


cd ..
Moves you up one level to the "parent" of the current directory.


If you don't specify a directory when you use cd, you are placed 
in your home directory. (This is different from MS-DOS, where cd 
shows you your current directory. To see your current directory in 
Unix, use pwd.) 
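
The commands above can be tried safely in a scratch directory. The following is a minimal sketch, assuming your system provides mktemp; the directory names project and notes are made up for the example.

```sh
#!/bin/sh
# Work in a scratch directory so none of your real files are touched.
scratch=$(mktemp -d)
mkdir -p "$scratch/project/notes"

cd "$scratch/project/notes"    # absolute path
start=$(pwd)
cd ..                          # up one level, to the parent directory
parent=$(pwd)
cd ./notes                     # relative path: "." is the current directory
again=$(pwd)

# The first and third lines printed are the same directory.
printf '%s\n' "$start" "$parent" "$again"

cd                             # no argument: return to your home directory
pwd
rm -r "$scratch"               # remove the scratch directory
```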


Displaying Files

To see the names of the files in a directory:

ls filespec Displays a list of the names of the files identified by filespec.
Examples:
ls
Displays the files in the current directory.
ls *.ps
Displays the files in the current directory with the extension .ps.


ls /usr/local/bin 
Displays the files in the directory /usr/local/bin. 


ls ~/w*
Displays the files in your home directory that begin with the letter w.


ls -a
Displays all the files in the current directory, including "hidden" files
(files with names that begin with a period).


ls -l

Displays a long listing of the files in the current directory, which 
includes information on each file's access rights, owner, group, size, 
modification date, and name. 
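
To see how the ls options differ, you can try them on a few throwaway files. This is a sketch only, assuming mktemp is available; the file names are invented, and .login here is an empty stand-in for a real hidden file.

```sh
#!/bin/sh
# Create a scratch directory with two ordinary files and one hidden file.
scratch=$(mktemp -d)
cd "$scratch"
touch report.ps notes.txt .login

plain=$(ls)           # notes.txt and report.ps; .login is not shown
all=$(ls -a)          # also shows ., .., and .login
ps_only=$(ls *.ps)    # report.ps

printf 'ls:\n%s\n\nls -a:\n%s\n\nls *.ps:\n%s\n\n' "$plain" "$all" "$ps_only"
ls -l                 # long listing: access rights, owner, group, size, date, name

cd /
rm -r "$scratch"      # remove the scratch directory
```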


To see the contents of a file:

cat filename Displays the contents of a file.


more filename Displays the contents of a file one screen at a time. To display the
next screen, press the space bar. To exit, press the q key.


less filename Displays the contents of a file one screen at a time, and allows you to
scroll forward or backward in the file.

To display the next screen, press the space bar or the f key. To scroll
back to the previous screen, press the b key. To exit, press the q
key.


Finding Files

To look for a file by name:

find dir -name 'filename' -print

Searches for the file filename in the directory dir or directories
below that directory.


Examples: 


find ~ -name '*.for' -print
Lists the names of all the files with the extension .for in your home
directory, or any subdirectories beneath it.


find / -name 'wharton.txt' -print
Searches the entire disk (from the top-level "root" directory on down)
for files named wharton.txt.


To look for a file based on its content: 


grep 'string' filename 


Lists all the files that contain the text string. 
Examples: 


grep 'Wharton' *.txt 
Lists all the files with the extension .txt that contain the text
“Wharton.” 
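
Both kinds of search can be tried on a few sample files. This is a minimal sketch, assuming mktemp is available; the file names and contents are invented. It also uses grep's -l option, which is not covered above: -l prints only the names of the matching files rather than the matching lines.

```sh
#!/bin/sh
# Build a small scratch tree to search.
scratch=$(mktemp -d)
mkdir "$scratch/sub"
echo 'The Wharton School'  > "$scratch/a.txt"
echo 'nothing to see here' > "$scratch/b.txt"
echo '      PROGRAM DEMO'  > "$scratch/sub/demo.for"

# Find by name. The quotes keep the shell from expanding *.for itself,
# so find receives the pattern and applies it in every subdirectory.
found=$(find "$scratch" -name '*.for' -print)
echo "$found"

# Find by content. Here the shell expands *.txt before grep runs.
matches=$(cd "$scratch" && grep -l 'Wharton' *.txt)
echo "$matches"

rm -r "$scratch"      # remove the scratch tree
```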


Moving and Deleting Files 


cp filename newfilename Copies a file.


mv oldfile newfile Moves a file.


rm filename Deletes (removes) a file.


Be careful when using wildcards to specify groups of files when using 
the rm command. Before using rm, use the wildcard with ls to make
sure you know which files will be deleted. 
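
The "ls first, then rm" habit described above looks like this in practice. The sketch assumes mktemp is available and uses invented file names in a scratch directory, so nothing of yours is deleted.

```sh
#!/bin/sh
# Set up a scratch directory with two temporary files and one to keep.
scratch=$(mktemp -d)
cd "$scratch"
touch draft1.tmp draft2.tmp thesis.txt

ls *.tmp           # preview: shows draft1.tmp and draft2.tmp
rm *.tmp           # the same wildcard now deletes exactly those files
remaining=$(ls)    # only thesis.txt is left
echo "$remaining"

cd /
rm -r "$scratch"   # remove the scratch directory
```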


Creating Files and Directories

mkdir dirname Creates a new directory.

Examples:

mkdir homework

Creates a new directory homework underneath the current
directory.


pico filename Creates or edits the file filename using the editor "pico."

To exit pico, press Ctrl x. If you've modified the file, type y to save
your changes or n to exit without saving your changes.


emacs filename Creates or edits the file filename using the editor "emacs."


To exit emacs, press Ctrl x and then Ctrl c. If you’ve modified the 
file and you want to save your changes, type y. To exit without 
saving your changes, press n, and then type yes when asked 
Modified buffers exist; exit anyway? 


For More Information 


Most Unix systems have extensive on-line help, in what is referred 
to as "man pages." To get help on a particular command, enter man 
command, where command is the name of the command. 


For example, to find out about all the options available with the ls
command, enter man ls. To find out more about the pico editor,
enter man pico. To find out more about man itself, enter man
man.


If you need help but are not sure of the command name, enter 
man -k keyword to look for help topics that include the word 
keyword. 


The man program displays help information one screen at a time. 
Press the space bar to display the next screen. To exit the man
program, press the q key. 


NetNews Using "tin":
Quick Reference 


When you're using the tin news reader on Wharton's Unix systems, 
you're typically at one of three places: 


> the newsgroup level, 

> the article directory (also referred to as the “index page”), or 

> within a specific news article. 

Below are some of the commands commonly used at each of these
levels. As with most Unix commands, case is significant. For
step-by-step instructions on using tin to read NetNews, see the
WCIT TechBrief "NetNews Using 'tin': Wharton's Unix Systems."


Entering and Exiting NetNews

Starts NetNews using the “tin” news reader. 

Leaves the news reader and returns you to the Unix system 
prompt (%) or the Main Menu. 

At the Newsgroup Level


Down/up arrow keys  Moves you up or down one line through the list of newsgroups to
select the current newsgroup. (If the down and up arrow keys don't
work on your system, you can also use the j and k keys.)

j  Moves you down one line (like the down arrow key on most
systems).

k  Moves you up one line (like the up arrow key on most systems).

Pressing the space bar  Moves you down one screen through the list of newsgroups.

Ctrl f  Moves you down (forward) one screen through the list of newsgroups
(similar to the space bar).

Ctrl b  Moves you up (back) one screen through the list of newsgroups.

Displays help screens of the newsgroup level commands.

Takes you to the article directory for the current newsgroup.

Subscribes to the current newsgroup. The newsgroup will be
displayed when you use y to display only subscribed groups.

Unsubscribes from the current newsgroup. The newsgroup will be
displayed when you use y to display all groups.

y  Switches the newsgroup listing back and forth between showing all
the newsgroups and only those to which you have subscribed.


At the Article Directory Level


Down/up arrow keys  Moves you up or down one line at a time through the article
directory. (If the down and up arrow keys don't work on your
system, you can also use the j and k keys.)

j  Moves you down one line (like the down arrow key on most
systems).

k  Moves you up one line (like the up arrow key on most systems).

Pressing the space bar  Moves you down one screen through the list of article titles.

Ctrl f  Moves you down (forward) one screen through the list of articles
(similar to the space bar).

Ctrl b  Moves you up (back) one screen through the list of articles.

Lists the titles of the individual articles within the current thread.

Displays help screens of the commands available at the article
directory (or "index page").

Pressing Enter  Displays the current article.

q  Moves you up to the newsgroup level.


From Within an Article


Pressing Enter  Displays the next screen of the current article or, if you're at the end
of the article, displays the next article.

Down/up arrow keys  Moves you up or down one screen at a time through the current
article. At the end of the article, takes you to the next article in the
thread or, if there are no other articles, to the next thread. (If the
down and up arrow keys don't work on your system, you can also
use the Ctrl f and Ctrl b keys.)

Ctrl f  Moves you down (forward) one screen through the current article (similar
to the space bar and the down arrow key). At the end of the article,
takes you to the next article in the thread or, if there are no other
articles, to the next thread.

Ctrl b  Moves you up (back) one screen through the current article (similar to
the up arrow key).

Pressing the space bar  Moves you down one screen through the current article. At the end
of the article, takes you to the next article in the thread or, if there
are no other articles, to the next thread.

Moves you down one screen through the current article (like the
space bar). At the end of the article, takes you to the next article in
the thread or, if there are no other articles, to the next thread.

Displays help screens of the commands available within an article.

Moves you up to the article directory level.

Sends electronic mail to the author of the current article. Includes a
copy of the current article.

Sends electronic mail to the author of the current article.

Mails a copy of the current article to another user.

Posts an article to the newsgroup in response to the current article.
Includes a copy of the current article in the new article.

Posts an article to the newsgroup in response to the current article.

Posts a news article on a new topic. (To post a reply to an existing
article, use f or F.)

Makes a copy of the current article in a Unix file.

Moves you up to the article directory level.


Wharton Computing and Information Technology
400 Steinberg Hall-Dietrich Hall


Getting Help
for Computing Problems


Walk-in and Telephone "Hotline" Support


WCIT Consulting
212 VH
898-8600

Wharton Computing and Information Technology (WCIT) provides
computer consultants to assist students, faculty, and staff in using
Wharton's computer systems, software, and services.

> Wharton computer consultants, walk-in help: 212 Vance Hall.
Standard hours: 9 AM to 12 noon and 1 PM to 5 PM,
Monday through Thursday; 1 PM to 5 PM Friday.
Summer hours: 1 PM to 5 PM Monday through Friday.

> Wharton consulting "hotline": 898-8600.
Hours: The same as walk-in consulting hours (see above).


CRC
38th & Locust Walk
898-9085

The University's Computing Resource Center (CRC) provides
consulting for Macintosh and DOS/Windows systems, file-transfer
services, and virus protection services.

> CRC, walk-in help: Locust Walk across from the Bookstore.
Standard hours: 9 AM to 4:30 PM, weekdays.

> CRC telephone support: 898-9085.
Standard hours: 9 AM to 4:30 PM, weekdays.


Electronic and On-line Help


Unix Help

Most Wharton Unix systems provide extensive online help through
"man" pages:

> Unix "man" Pages: Type man command, where command
is the name of a Unix program or command, for more detailed
information on a particular command.

Type man -k keyword to search for information on keyword.
For more information on using the man command enter man man.


VMS Help

If you use Wharton's Academic VMS system (host system "Wilma"),
you can get online help for many VMS procedures.

> VMS Help: Type help at the VMS system prompt ($).

Many VMS applications have additional online help available from
within the application. For example, for help using the VMS Mail
utility, enter help from the Mail prompt: MAIL> help

© Copyright 1991-94 The Wharton School of the University of Pennsylvania


Consultant E-Mail

You can also use electronic mail (e-mail) to send questions to the
computer consultants at Wharton or the CRC.

> Mail to Wharton's consultants: send to username consultant.

> Mail to CRC consultants: send to username
crc@a1.relay.upenn.edu.

You should receive an answer by e-mail within a day or two.


CRC Tutorials
898-9085

The CRC maintains a library of tutorial software that you can use to
learn at your own pace. Most tutorials are at the introductory level.

> CRC Tutorials: 898-9085.


Computing Documentation


WCIT TechBriefs

WCIT TechBriefs are short "how-to" documents that provide
step-by-step instructions on a single computing topic. Additional
TechBriefs contain information on computing services and policies.

> WCIT TechBriefs: 212 Vance Hall.


WCIT User Guides

WCIT User Guides contain information on selected software and
systems used at Wharton, including Wharton's data resources such
as Compustat, CRSP, and Citibase. WCIT User Guides are available
for reference in Wharton's computer consulting office or can be
purchased from Wharton Reprographics.

> WCIT User Guides (Reference):
Wharton computer consulting, 212 VH (898-8600).

> WCIT User Guides (Purchase):
Wharton Reprographics, 400 SH-DH (898-1251).


Vendor Documentation

The vendor documentation for software installed on Wharton's
computer systems is available for reference in Wharton's computer
consulting office.

> Wharton computer consulting: 212 VH (898-8600)

The CRC contains a wide selection of vendor documentation,
primarily for microcomputer software (both MS-DOS and Macintosh).

> Computing Resource Center: Locust Walk across from the
Bookstore (898-9095).
Third Party Books 

In addition to the “official” manuals that ship with the software, third 
party documentation is available for most popular computer software. 


Two bookstores with a wide selection of computer books are: 

> University Bookstore: 3729 Locust Walk (898-7595). 
> Borders Bookstore: 1727 Walnut Street (568-7400). 


News Announcements 

The following periodicals provide current information on computing 
events and services at the University: 


> Penn Printout 


Information about computing at the University appears regularly in 
Penn Printout. The September issue usually contains an overview of 
University resources for help with computing. Back issues of Penn 
Printout are available from the CRC. 


> The Wharton Journal 


Wharton Computing occasionally publishes news announcements and 
general information through articles in the Wharton Journal. 


Training Classes 


WCIT Short Courses 
400 SH-DH 
898-2667 

Wharton Computing and Information Technology (WCIT) provides 
a series of computing “short courses” that offer hands-on computer 
training. 


> WCIT Short Courses (Registration): 400 SH-DH (898-2667) 


WCIT computing short courses are free to Wharton students, staff, 
and faculty. Each course is $25 for other members of the University 
community. A $5 cash deposit may be required of Wharton affiliates 
who have previously failed to attend a course. 


For course descriptions, see the WCIT TechBrief, “Computing Short 
Courses: Course Descriptions.” For a list of classes, see the Short 
Course Schedule TechBrief for the current semester. Both documents 
are available from Wharton’s computer consultants in 212 Vance Hall 
or at the WCIT Computing Services window in 400 SH-DH. 


Lippincott 
898-5924 

At the beginning of each semester the Lippincott Library of the 
Wharton School provides one-hour training sessions on library 
research techniques, covering both print and online resources. 

In addition, daily one-hour training sessions introduce users to the 
electronic information services available at Lippincott, including both 
online services and CD-ROM systems. 


> Lippincott Training Classes: 898-5924 
CRC 
898-9085 


The Computing Resource Center offers both formal hands-on training 
courses and the more informal “Bits and Pieces” noontime seminars. 


CRC hands-on courses are three-hour classes on various software 
topics for both the novice and advanced computer user. 


> CRC Training Classes: 898-9085. 


CRC’s “Bits and Pieces” seminars are hour-long sessions held 
at noontime. Most sessions provide a presentation with time for 
discussion. Times and dates of the “Bits and Pieces” seminars are 
published in the Penn Printout. No registration is necessary. 


You can profit from the MIS shortage 
Ranks of computer grads are thinning as students lose interest 


Investment houses go for 
broke hiring IS professionals 


Demand for UNIX C programmers soars 


There is, and will continue to be, a severe shortage of college graduates coming into the 
management field. Firms are trying to distinguish themselves in tough global markets 
by offering customers fast, easy access to timely information, so information management is a 
recession-proof priority. In fact, systems analysis is projected as the seventh 
fastest growing occupation in all white collar and blue collar sectors of the economy. 














[...] requires that today's managers understand [...] the 
design and implementation of those information systems. In response to that need, the Operations 
and Information Management Department offers a rich set of courses in the area of management 
[...]. In contrast to 
programs in computer science, which tend to focus on the development of basic computer 
technology, Information Management emphasizes the business use of technology. The primary 
[...] business problems and to empower [...] 











[...] It counts as a general 
breadth requirement in the Wharton undergraduate program. This IS NOT a lab course like 
OPIM 101! No prior technical background is required. Classes focus on readings and cases, 








[...] programming practices and discusses the management of 
commercial software application development. Students gain practical experience using the 





OPIM 314 Computer Mediated Communication: Business, Technology, and Policy 
Students gain both hands-on experience and knowledge of how companies are using the Internet 
to reduce costs, decrease cycle times, and support joint ventures. This course examines how 
companies manipulate, store, communicate and retrieve information to conduct business. 


OPIM 315 Database Management Systems 

This course introduces [...] of data management and 
retrieval. Students gain practical experience in the development and use of database and 
multimedia systems. 


OPIM 410 Decision Support Systems 
The course presents an overview of expert systems and decision support systems technologies, 
trends, and products. It also explores the development of expert systems and decision support 
systems and prepares students to use leading-edge tools in this area. 
Only you can prevent an MIS shortage 

What's better than being a doctor or an 
engineer? Becoming a computer analyst! 








BY JERSEY GILBERT 

There are no more dreaded words in corporate America than these: "The 
system is down." Your boss is screaming, your clients are whining. What can you do? Call a 
computer systems analyst, that's what. Systems analysts are the indispensable people who install, 
customize and supervise computer operations at offices and factories across the nation. And now, 
with their services increasingly in demand, it's no surprise that they have the best job in America, 
according to Money's latest ranking of 100 jobs, chosen to 
represent a wide spectrum of pursuits. The Bureau of 
Labor Statistics (BLS) believes there will be 501,000 systems 
analysts jobs created between now and the year 2005, 
a gain of 110% from today's 455,000, and that 501,000-job 
forecast represents a 37% upward revision from just 
two years ago. That explosive anticipated growth helped 
propel systems analyst to the top of our chart from No. 31 
in our previous jobs ranking, published in 1992. (Our complete 
listing of 100 jobs appears on page 72, and a guide to 
finding a new job when you lose yours is on page 74. Finally, 
on page 82, we present an exclusive poll that probes 
your attitudes toward work and enjoyment.) 

Need more proof of our No. 1 choice? In her 10 years at 
the New York Daily News, Jean Leonardi, 36, pictured at 
right, has worked her way up from system programmer to 
systems manager in charge of the software that prepares 
news copy for the printing press. Along the way, she has 
seen her salary rise above the industry median of $42,700 a 
year. Leonardi adds that her work is satisfying as well. "In 
this job," she says, "people have a problem, you fix it, and 
boom, there's an immediate reward." 

Among our other notable findings, doctors scored well 
despite all the talk of drastic health-care reforms. Their high 
prestige and salaries (median: $145,000) lifted them to No. 2 
on our list, up a notch from No. 3 in 1992. Two other health-care 
professions, however, rose sharply in the rankings 
thanks to the growing tendency to shift medical services 
away from high-priced MDs: physical therapist (No. 3, up 
from No. $0) and registered nurse (No. 28, up from No. 52). 

Budget cutting at colleges and universities took a toll 
on some of the scientific careers that dominated our list 
two years ago, including biologist (No. 16 this year, down 
from No. 1), geologist (No. 8, down from No. 2) and 
mathematician (No. 27, down from No. 4). Electrical and 
civil engineers, who are not dependent on universities for 
employment, more than held their own, moving up to the 
No. 4 and No. 5 spots from No. 13 and No. 9, respectively. 

Even homemakers are not immune to economic trends. 
The Census Bureau reports recent declines in the percentage 
of couples who have children under 18 and those with 
only one wage-earning spouse. Those shifts led us to 
downgrade our estimate of the future demand for homemakers. 
Therefore, they fell to No. 61 from No. 51. 
Nonetheless, being a homemaker outranks several less-prestigious 
paying positions that involve some of the work 
that homemakers do for free, including cook (No. 72), 
waitress (No. 83) and telephone operator (No. 96). 

Management consultants, on the other hand, jumped to 
No. 17 from No. 49 as job-cutting corporations increasingly 
turn to outside specialists to do work once performed by 
permanent staff. It's called outsourcing, in bureaucratese. 
And for all the talk of the Clinton Administration reining 
in lobbyists, their high salaries (median: $91,500) and persistent 
influence vaulted them to No. 50 from No. 75. 

MONEY, MARCH 1994 


Photograph by Michael Spano 
SOFTWARE STAR 
Systems analyst Jean Leonardi spends her day making sure that 
copy gets to the printing presses at New York's Daily News. 

Trash collectors nosed out taxi drivers for last place. Along with earning 
the lowest prestige rating of any occupation on our list (through no fault of their 
own), sanitation workers shoulder physical demands exceeded 
only by those of fire fighters and farmers. But fire fighters 
(No. 89), with median pay of $2200, are far better compensated 
than sanitation workers ($18,800). Furthermore, the 
public, in an opinion survey, says being a farmer (No. 92) 
carries twice as much prestige as being a trash collector. 

Our rankings are based on what you told us was most 
important about a job. In a recent survey of 250 Money 
readers, you rated opportunity for career advancement and 
job security at the top. Next came salary, having a clean 
and safe workplace, getting a chance to do something [...] 
meaningful, prestige and [...] stress. We gathered 
the data for our rankings from the BLS, the National 
Opinion Research Center, the Jobs Rated Almanac and 
more than 150 professional and [...] organizations. 


Reporter associates: Leslie Marable and Kelly Smith 

Reporter associates: Leslie Marable and Kelly Smith 
[see the story for details]. This table shows the data we used to rank each job. In addition, the last column suggests where you might 
have the most luck finding a particular job by naming the metro areas with the highest concentration of people in each field. 
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Business analysts trained 
in operations research 
can be a secret weapon 
in a CIO's quest for 
bottom-line results. 
INFORMATION AGE / By William M. Bulkeley 


Computers Start to Lift U.S. Productivity 
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ноа 9X entirely handled by 
Never mind 
services— 
the information 
economy is driving 
growth in the 90s. 
For most Americans, 
that translates into 
jobs and prosperity 


BY MICHAEL J. MANDEL 


In every era, there is a group of industries that sets the pace for the rest of the 
economy. A century ago, the railroads were America's growth engine. In the 
postwar decades, manufacturing was the key to U.S. prosperity. During the 
1980s, the driving forces of expansion were booming service industries such as 
health care, legal services, and retailing. All told, during that decade, the service 
sector accounted for practically all of the growth in jobs and corporate profits. 
Economists began to speak of the U.S. shift from a manufacturing to a service 
economy. 

Yet for all the vitality of services, many skeptics did not see how they could 
make the economy thrive over the long term. In fact, the shift seemed like a giant 
step backward, since service jobs paid lower wages on average than manufacturing 
and had significantly slower productivity growth. Moreover, services such as 
medical care and retailing were much harder to export than manufactured goods. 
The worry was that if the U.S. lost its manufacturing industries, it would have a 
difficult time selling enough services abroad to pay for its imports of cars, 
consumer electronics, and other goods. 

Fear not: Like adolescence, the service economy has turned out to be a 
temporary stage. Far more than most people realize, economic growth is now 
being driven not by services, but by the computer, software, and 
telecommunications industries. Indeed, according to the Commerce Dept., business 
and consumer spending on high-tech equipment accounts for some 38% of economic 
growth since 1990. 

What's more, government statistics underplay the evolution of the information 
economy. Industries that depend [...], such as financial services and 
entertainment, are prospering. And companies in every industry are using information 
technology to reengineer themselves and become more competitive. In short, "the 
role of information is transforming the nature of economy," says Kenneth J. Arrow, 
a Nobel prizewinning economist at Stanford University. 

REPAVING THE ROAD 
NEW FIBER-OPTIC CABLES ARE GIVING TELECOM CAPACITY A BIG BOOST 

In this regard, at least, the U.S. is leading the way for the rest of the world. 
Europe is deregulating its telecommunications industry in order to create jobs 
and stimulate [...]. Japan is [...] the considerable edge the U.S. has built 
over the decade in personal-computer and network use. Even developing 
countries such as China, Hungary, and Thailand are investing heavily in 
state-of-the-art communications systems in an effort to leapfrog their way to prosperity. 

America remains way ahead, however. And it's the place where the consequences 
of the new economy are first showing up. To a large degree, the news 
is turning out to be good. For one thing, unlike most services, information 
products such as software and [...] can be easily exported. And whereas 
productivity in the service sector grew slowly, investment in information 
technology is boosting productivity across the economy. 

Beyond that, the effect on work is less harmful than once feared. Far from 
becoming low-paid burger-flippers, the quintessential job of the service [...], 
[...] show that their wages are on the rise as a result. For example, earnings 
for male computer programmers have risen by 12% since 1990, compared with 
6% for all male workers. For female computer programmers, the pay gains have 
been even bigger: a 21% rise since 1990, vs. 15% for all female workers. 

The drawback is that along with the winners, there will temporarily be lots 
of losers. Higher productivity has led to big layoffs at many companies, 
especially in the telecommunications industry (table, page 26). Elsewhere, 
meanwhile, advancing technology is favoring skilled workers over unskilled, 
increasing the inequality in wages. 

For better or for worse, this transformation is occurring at an astonishing 
rate. Look at business investment. Measured in [...] on software and [...]. 
Meanwhile, business spending on industrial machinery, which traditionally has 
been the guts of manufacturing, has fallen as a share of equipment investment 
from 32% in 1975 to only 18% in 1993 (chart). 

At the same time, information technology and services are helping to drive 
the continuing export boom. The aircraft industry is often held up as the 
shining star among U.S. exporters. Yet America is also the world's biggest 
exporter of software, a fact that doesn't show up in the government's 
numbers. In 1991, major U.S. software companies sold $2.5 billion worth of 
personal computer programs in Western Europe, Asia, and Latin America, 
according to the Software Publishers Assn. Microsoft Corp. alone derives 
some 55% of its revenues from overseas sales. 

The U.S. also is running a huge $3 billion trade surplus in computer-related 
services, such as data processing and information databases. It's nearly as 
easy now to send information to Europe or Japan as to the next state or across 
the hall. For example, Mead Data Central Inc., the company that runs the Lexis 
and Nexis services, which contain legal news and general news respectively, 
also has databases on French and British [...] that lawyers in those 
countries use. The location of these [...] 





















Coming improvements in overseas communications will even 
make it possible to export such 
services as medical care. By this 
coming summer, doctors across 
sparsely populated South Dakota 
will be able to use a statewide 
telecommunications network to 
consult with specialists hundreds 
of miles away. The same expertise 
could be [...] to Asia 
or Latin America just as easily. 
“The information economy can 
breed a healthy economy because 
a lot of its services are exportable,” 
says George Bennett, chairman 
of Symmetrix, a technology 
consulting firm. 

Two other positive byproducts 
of the Information Age are 
greater [...] and lower prices. 
During much of the 1980s, 
economists worried that they 
could not find any impact of 

computers On producriviry Bor mere me- : 
Cant ciio shows thar investmenc in | 
cnmpusers afte worthwhile, Eecnomist | 
Erik Brynjolfsson and Loan Hie of the | 
Massachusers шине of Technology : 
surveyed 400 large compamies m gauge | 
the effect af technology on output рег : 
employes. They found that the retum : measure. Take the communications st- 
: ms whieh includes the relephone, Бира 

casting, and cable industries. According - 


an invesrmenr in сатана нет 


EXceeded 3096. "And mos of these ben- ` 
‚ излей] on to consumer | 
in the form of lower prices" mvs Biren | 


eft аге bet 


lisa 
in fact, the productiviry surge of the 


last пио vears— when nonfarm output = 


per worker rose by 4,95, ил biggesc two- 
маг jump since |9ST6-—nav reflect che 


full advantage of the huge 
sums they've spent pui- 
chasing information rechnol- | «sz 
ogy "IF T put technology in. | 7 
and nothing changes, and | EP 
then later a business gets in | Б 
a crunch and dincevers thar 
it сап cut out all the middie | E 
management, whar made it | 
possible?” asks Raymond | 
Perry, chief informanon of- | 9 
ficer sr Avon Produce: Inc. 
| "Well probably the technol- | BENE 
cay did. Its just that we |+ Ж 
WEED т теу to tke rhe 129 
people out unti ú laer paint 
in ome. 
Even Che recent produt- 
vire numbers potan far | 
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to government figures, communication: 
Only 3.1% of the economy, up from 


| 28% in 1984, ar the ame of che АТАТ 
j divestiture. Over the same penod, min- 
utes of telephone use—a key number : 
| truckexdd by rhe. Fedden! Coenrmunieations 
j і Commuson—hħas grewn only неу ` 
effors of U. S. companies m finally mke | `i oe 


faster than the overall eeonomx 





а: = 


i 
ч” = 


гбг: 





official statistics ignore many of 
the changes of the past decade. 
For one, a much greater percentage 
of the calls over the phone 
network are faxes and computer 
data going back and forth, rather 
than people talking. As much as 
10% to 20% of the traffic across 
the AT&T long-distance network 
may be data, estimates Frank 
Ianna, the company's general 
manager for network services. 
That's up from 7% to 10% a few 
years ago. And because of time 
and language differences, about 
half of international calls are data, 
not voice. 

These fax and computer messages 
pack a lot more data into a 
minute than they used to. Over 
the past few years, for example, 
the speed of a typical modem, 
which is used to transmit information 
between computers over 
phone lines, has quadrupled. 

hat mesns the amount of infor- 


- : mation being pumped through rhe sys- 
- tem Баз gone through the roof. The 
= point m this: If the ouput of che com- 


‚жесш ге in tems 


- of dare transferred instesd of che num- 


ber of тїшє it's in use, iz would show 


г far mote dramatic growth thm the pub- 


lmhed numbers indicate: 
Price in che communications sector 
have abo likely fallen much more sharp- 


= Ју chan the government numbers show. 
- According to the Bureau of Labor Saitis- 
| tts, the producer price index for inter- 


state telephone service has nsen by 
245 ever the past five verns. Yet this 
figure doesn't take moo account the dis 
count calling plana that most long-dis- 
tance compares now offer. Мог does it 
. adeguarely track che cost 
and use of leased fines. The 
BLS hopes m remedy some 
| of these oroblems with à 
new index for telephone 
prices, perhaps by January 
| The information econo 
у] | my 20100 has a much larger 
ЖЕЙ podiv: apaiy than rhe 
current government satis 
{| tics indicate. For the mo- 
| ment, the main measure of 
how close the economy із 


TELE-SELLATHON 
HOME SHOPPING 
CLUB HAS ADDED 
THOUSANDS OF | 
JOBS IN NINE YEARS 








to its maximum operating rate is the 
Federal Reserve industrial capacity utilization 
number. While this includes utilities 
that sell electricity and natural gas, 
it leaves out telecommunications. That 
means there is no good measure of the 
amount of spare capacity in the U.S. 
telecom [...]. That is an [...] 
omission, since many businesses have 
become increasingly dependent on reliable, 
and widely available, communications 
services. 


Even the investment boom of the past 
few years understates the true value of 
the spending on information technology. 
According to Commerce Dept. figures, 
investment in communications 
equipment has barely risen since 1990. 
What these numbers don't say is that 
for the same price, companies have been 
able to buy vastly more sophisticated 
[...] and other telecommunications 
equipment, with new capabilities 
such as call forwarding. 

Bevond those hidden by rhe mer- 
urememnt problems, there are some fun- 
damental differences between the in 
immani eennam;v amd ic predeoessom 
In the рая; ceehinolozical improvements 
such as rzilroads. auto plants, ond ste 
mulli required vast amounts of capita! 


a mam — 


J Р 
Ра _': 
HONK IF YOU'RE ON-LINE 

AS TELECOMMUNICATIONS 
IMPROVE IN RURAL AREAS, 
URBAN TIE-UPS COULD GO 
THE WAY OF THE EDSEL 


But because the price of information 
technology continues to drop so quickly, 
companies can spend less to get 
healthy improvements in productivity 
and quality. Indeed, in recent years, the 
productivity of capital, defined as the 
amount of output produced per dollar 
of plant and equipment, has [...] up 
for the first time in the postwar era. “As 
the U.S. becomes an information-oriented 
economy,” says William Sterling, an 
economist at Merrill Lynch & Co., “you 
may have less need for capital than you 
have in the past.” 

For example, phone companies are 
able to boost the carrying capacity of 
their existing fiber-optic cables by simply 
upgrading the electronics at either end. 
That means they can add to capacity 
without having to go through the 
expensive process of digging up old cables 
and installing new ones. 

Even connecting all of the nation’s 
homes to the Information Superhighway 
may cost less than expected. In California, 
Pacific Telesis Group and AT&T 
are estimating that it will cost an average 
of $500 to wire each of 1.5 million 
homes with a combined fiber-optic/coax 
cable network that can carry the 
most advanced services. That compares 
with $1,900 for the electronics and labor 
needed to run a fiber cable all the 
way to the home. "The fiber-only estimates 
were scaring everybody off," says 
Robert Clark, vice-president for marketing 
and sales at AT&T Network Systems. 
"We've been able to see another 
way of getting all the services." 

If these lower estimates turn out to 
be right, it won't come as a total surprise. 
On a comparable basis, the price 
of information-technology equipment 
has dropped by 23% over the past five 
years, according to Commerce Dept. 
numbers. The trend, if it continues, will 
have important implications for interest 
rates. If companies need to borrow less 
money to finance their investment in 
high-tech equipment, that will keep 
overall rates lower than they would have 
been otherwise. And that will benefit 
homeowners, the government, and other 
borrowers. 

Soll there's rhe recess 


' of chose FA- 
ets Tram the shi 


t tn thé information 
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ты 19054 Т? 
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denomi Ar the тор of the list tre the : eun are 


wheres eh and апаа ed logical 
panies reenpineer their businesses. The games, for 
reduction in saf сал be enormous, Ac | 


driven mainly by chno- : c 
| do not use еб DUET avi 
exampie, do with fewer oper | : 
ators, maintenance people, end ocher | Dene: -Them ii ver strong evidence 


ues in similar есир ations whe | 
arene 
icf economist ar che Labor 






cam higher wages. 


USAir Inc. 630 people were onge need- | lal phone эмие in New Барыш eben te | thar people who work with computers 


ed in the revenue accounting depar- | local рема Бегу 


ment, ow that much 
of the process has been 
aurarmared, only “pe 


др eec szvs ‘Sen. 
ior Vice-President and 
CFO Jahn W. Harper. 
And at many cor 
nies, the downsi 
isn't over "Where w 
all these 
loved? Lester 
| lThursow, an A освојен 
professor аг MIT and 
former dean of rie unb 


versity s Stoan School ОГ ФИ т: 


i ОЕТ 









реше pressure pla a tole. but these |: 


= | 


hich ше at the 


T" rural South 





Dakota schools can 





eS 





= орао ена 
: Ome Indeed, recent studies suggest а | 
| ilg peso dot workers who fee at.: 


| Galt ewen i Joni people die kada 
| left behind, the information economy 
| још: For example, the Home | 
ene AU simi | 






d has "andy. ple, up fom папи 
пиле č spectrum are | Ah 
dropped by 154.000 | | аъ SandPoint Corp, a Cambridge 
since 1988, with more (Mast) maker of softwaré that helps 
Cuts to come i people track Gown information in dat- 
Also at sea in the не | bases. Over che pas vear, SandPoinr has 
économ* ane ^ леп fram 15 то 32 employees, and irs 
unskilled workers and | still expanding. Overall, rhe number of 
; jobs in the software, data processing, 
and information rerrieval mdussries has 
rsen bv 31% Hace 1988. and these in- 
dusiries now employ more people chan 
| "E MU Viae | 
| economy may even make it a bit easier | 


Вст разг 


SE more than 


| ine in June. 1993, observes 


— 


| unemploóvment. 





to march workers to exis- 
ing jobs. The Online Ce- 
peer entet based in Irdi- 
апар, provides job and 
resume [ings on rhe 
Interner. Sinee it went on- 


Director William Warren, it 
has become one af che 
most popula: databases an 
the ssim, wich 13,000 по 
14000 inb openings listed 
and nearlv as many ir- 
sume Лете, пасо 
wide listing services such 
23 this could make labor 
makes more efficient and 






The efecti of the information econo 
пту айс even reaching into rura] areas by 
shifting development away from боп. 


gered urban regions. With more and : 


moe parts of rhe counrrv having access 
to high-capacity telecommunicationt, 


ness “What telecomrmmunicatinns allows 






help jower | 
| UNLIKE SERVICES. 





NEW ERA FOR EXPORTS! 


AMERICAN COMPUTERS 


| AND SOFTWARE CAN BE 


SOLD EASILY OVERSEAS 


business relocarson firm. 
Technological advances will have an 


і even more profound impact on che vial- 





ing big-city services and 
amenities to small tows. 
For example, the telecom- 
mtrkstinit necwork opes. 
ated by che stat of Sowth 
Dakot enables rural 
schools to offer Spanish 
Classes via Interactive TV— 
Something thev would 
never have been able to 
da an their ern. The in- 
formation revolution, sas 
Sourh Dakota Governor 
Wilker D Miller, “is goi 
of change the face б 





: South Dakotu as much as пали elecuif- 
: Caton: did" 


Thats an ape parallel. Jusces che U. ©. 


| economy today would be unthinksbie 
| wirhour elecuicin, so will tomorrow x 
: eonómv be spurred bv rhe fret flow of 
vou to do is pur the пейт facilines with | 
: tha nghr labor" nuces Ken kuhl. à cem: 
companies cn now pur jobs auch gs or. | 
der-tuxing in remote locariohs withour | 
losing touch with the rest of che busi- | 


informanon. Judging bv rhe explosive 


: growth of information technology so far 
sultant with Moran, Stahl & Bover а | 


the Juice is only szxrring из flens; 


г With [nz Sager гт Муш Yor’, Homara 
Glen it Пантер, aad bDbaurrsu 


irv af rural areas bv bring 
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Semiconductors 





ludon has been built on tana—ailicon- 


5 тип up арзі 


based chips, that is. Each year, by cramming 
more circuits on a memory or 
microprocessor chip, engineers have 
made computing technology cheaper, 
more plentiful, and more adaptable to 
new uses. Without those annual 
improvements, the spread of information 
technology would slow, and new appli- 


; pipelines chat carry 
t real 





Nothing is final. Better mousetraps 
come along all the time, not in 
one neat package, necessarily, but bit by 
bit. Here's the shape of things to come: The 
10 critical technologies of tomorrow 





underseand whoever vau sry for cx- 
zmplé—mijht not be practical. 
Until recenriy, scienuszs had nagging 


| doubs about the fumure of silicon. Thes : 
1 feared thar by around 2000, they would 
: below which silicon стала couldn't 

> Work. [hat would require shifting to 
; Other, more costly materials Bur char | 
| py es su limit has now been lifi- | 
| е new resexch ar АТАТ Bell Labo- | nenria Th in processing и 

| mones, Hirschi, NEC, Toshiba, and оф : eh deg a Aaa 
er labs. “As far as we can see,” says Paul 
M. Horn, IBM's director of silicon technology 
research, “there's no science limit 
for the next 30 years.” 

| la ees words, chipmakers will con- ` 
| tinue doi hat _ 

For 20 years, the Information Reve : ve Re TER 


they have always 


scoop our everti- 
fier menches for he 





dara around silicon 
сиште. The 
more — plumbing | 
that's buned in ach | 
silicon plot, the | 
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PUSHING THE STATE OF THE ART 


. аана. 


CUTTING EDGE 





engineers can erect for crunching numbers, routing telecommunications traffic, or presenting a friendly face to users. Smaller is thus always better, yet never good enough.
The progress so far has been astounding. When Intel Corp. developed the microprocessor in 1971, the year after it invented the dynamic random-access memory (DRAM) chip, the state of the chipmaker's art enabled engineers to lay down lines that were 6.5 microns (millionths of a meter) wide. That yielded
on a chip the size of a thumbtack. Memory chips held 1,024 bits, and microprocessors were capable of slowly crunching eight bits of data at a time. Seven technology generations later, chipmakers are etching circuits that are just 0.5 microns across, making it possible to cram up to 35 million transistors on a chip. The result: DRAMs that can store 16 million bits of data and 64-bit microprocessors that are 550 times as powerful as the first Intel chip, or about the speed of a 1986-vintage IBM

accurate weather forecasts, interactive TV, or computers that


What's in store? A continuing expo-
capacity: In reducing line width by 70%, the number of possible transistors doesn't just double; it jumps more than tenfold. And as the circuitry gets more dense, you get another performance boost: With everything scrunched closer together, signals can zip more quickly between transistors, so the microprocessor's clock speed can be increased. Just a couple of years ago, the microprocessors in most personal computers ran at 25 megahertz. Now, Intel Pentium chips run at 100 mhz, while Digital Equipment Corp.'s
пре ин Alpha chip силе at 153) miz 
And for reduced instruction-set computing (RISC) chips such as Alpha or the
| date into oo at 


IBM/Motorola PowerPC, speeds could


| 
- эсеп and rechnülo- | 


; 1 ТЕ 


: biggen difference in 


Sophisticated lasers
transform electrical representations of conversations, faxes, or
from printers to high-capacity networks. At the Naval Research


hit 400 mhz by decade's end and 300-plus mhz in 2002.
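The density claim in this passage can be checked with simple geometry. The sketch below is only an illustration under two assumptions that the article does not state: that the number of transistors fitting in a fixed area grows with the inverse square of the line width, and that the garbled percentage in the scan reads as roughly 70%. The helper name `density_gain` is hypothetical.

```python
def density_gain(line_width_reduction: float) -> float:
    """Return how many times more transistors fit in the same chip area.

    Assumes transistor area scales with the square of the line width,
    a simplification used here for illustration only.
    """
    if not 0.0 <= line_width_reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    remaining_width = 1.0 - line_width_reduction
    return 1.0 / (remaining_width ** 2)


if __name__ == "__main__":
    # Lines 70% narrower keep 30% of their width: roughly 11 times the density.
    print(f"70% narrower lines: {density_gain(0.70):.1f}x the transistors")
    # Going from 6.5-micron to 0.5-micron lines is a 13-fold shrink in width.
    print(f"6.5 -> 0.5 microns: {density_gain(1 - 0.5 / 6.5):.0f}x the transistors")
```

Under this simple model, a 70% reduction in line width gives a little more than a tenfold gain, which is consistent with the "more than tenfold" figure quoted in the text.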

For the foreseeable future, at least, it seems that silicon will be able to meet the needs of an increasingly demanding market. We'll move from today's 16-megabit DRAM chips to 64 megabits, and then, around 1999, to 256. And the


their destinations, where they are converted back into electricity. Today's most capable fiber


first next-century memory chips will store an incredible one billion bits of data; eight such chips could hold the Encyclopaedia Britannica. Applying this chipmaking technology to microprocessors will yield chips as powerful as an entry-level Cray supercomputer from Cray Research Inc. But the chip will cost a few hundred dollars, not a few million. So whatever gets dreamed up for tomorrow's computers, silicon has the horsepower.
By Otis Port in New York



















Optoelectronics 


Call it the invisible backbone of the Information Age. Every melody from a


this underlying technology,
wouldn't be an information infrastructure," says Edmond J. Thomas, execu-


impossible to build. That's
why the U.S. Optoelectronics Industry Development Assn.
billion today, and more than
$200 billion within a decade.
optoelectronic envelope. By record-
so-called blue lasers with shorter wavelengths to
now predict they'll be able to pack perhaps 18 trillion bits of data on a single
ability to store floods of digitized video
veloped by startup Photonics Research Inc. in Longmont, Colo., is fashioning


| sens and optical кецейе, they re pump- 
| дн nite nie ie | 


; for із wired wt diving the cost down— | 
CD player, every voice over long-distance phone lines, each page from a laser printer comes to you courtesy of a felicitous marriage of light and electricity known as optoelectronics. Without
"there

























tive vice-president for
entire arrays of hundreds of lasers on a single wafer, instead of individual devices. Such arrays not only cut costs greatly but
mance in everything
gy at Nynex Corp.
Right now, fiber op-
making the
telecommunications.





Laboratory, researchers have etched patterns directly onto optical fibers as the glass is being made. The result is an inexpensive optical sensor that can be threaded through plane wings or skyscraper walls to warn of stresses and strains. With such advances on the horizon, the role of fiber optics will become anything but invisible.
By John Carey in Washington

That's a 10,000-fold improvement on copper.
The total market for optoelectronic components is relatively small, now consisting mostly of flat-panel displays in laptop computers. But without components such as a $5 laser in a CD player, entire lines of products would be


Milk eS LE ER 


light, which speed
through the fiber to


hare lines ean Gurr a 


ш 





(OIDA) says the technology is responsible for markets worth $50


— m 
Researchers continue to push the
ing information as holograms or using





to feed Mote tity "poc опа: 
disk, scientiscs miy creare devices with | 
unheard-of Scientists ` 


Parallel Processing 


You and 10 friends could paint your house a lot faster than you could paint it by yourself, right? If you understand the logic of teaming up to compress the time it takes to get a job done, you've already got the basic idea behind parallel processing, the key technology pushing the power curve on upcoming generations of large-scale computers.

The appetite for computer power remains boundless. Today, the digitalization that is converting TV, movies, magazines, and phone calls into the 1s and 0s of computers is placing heavy demands on computing hardware. Even the most powerful supercomputer processor alone is no match for the job. Which is why parallel processing will be a critical technology in making multimedia and the Information Superhighway move from hype to reality.

ond and figure on hiring
What this could mean is an explosion of low-cost, high-speed communications capacity and the
rhighway. But there's
uch of the current ef-
inch platter (page 63). With new la-
and sound, two key components of the
Information Su-
In some cases, down two orders of magnitude [by a factor of 100]," says Deven Hartman, head of an optical interconnection research group at Motorola Inc.
Breakthroughs in manufacturing would help. One possibility, being de-

In parallel processing, computer architects achieve supercomputer speeds by linking together anywhere from a couple to many hundred processors and programming them to work in concert. Software divvies up tasks among all the processors, much as you might ask Ed to paint the porch while Sue starts on the window frames. The trouble is, coordinating those processors (making sure that the right processor gets the right piece of information at the right time) is tough.
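The division of labor described above can be sketched in a few lines. This is an illustrative toy rather than how any machine named in the article works: the `paint` and `paint_house` names are hypothetical, and Python threads are used only to show tasks being handed out to a pool of workers, not to demonstrate real multiprocessor speedups.

```python
from concurrent.futures import ThreadPoolExecutor


def paint(section: str) -> str:
    """Stand-in for one independent piece of work (hypothetical task)."""
    return f"{section}: painted"


def paint_house(sections: list[str], workers: int) -> list[str]:
    """Divide the sections among workers and collect results in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands each section to whichever worker is free and
        # returns the results in the original order of the sections.
        return list(pool.map(paint, sections))


if __name__ == "__main__":
    jobs = ["porch", "window frames", "front door", "shutters"]
    for line in paint_house(jobs, workers=2):
        print(line)
```

The coordination problem the article mentions is hidden here inside the pool: the library decides which worker gets which job and puts the answers back in order. When tasks depend on each other's results, that bookkeeping becomes the hard part.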


But progress is being made. Just a few years ago, parallel processing was the province of scientists and engineers. Now it is moving into the mainstream, especially for applications that require huge databases. Wall Street firms


ture, parallel
machines may be the only kinds of
digitized movie takes about 2 billion
personal-computer hard disk stores only 210
The answer: yes. If current technical improvements stay on track, by 1996 it will be common to have inexpensive hard-disk drives with a billion bytes of capacity in desktop PCs. After that, there
switching to parallel computers, too. "At least in advanced computing," says in-





puters, mainframes, or high-end servers that survive. IBM recently brought out its first parallel-ar-





ји рата ђе! инн for or engineering. | 


Other major computer makers, including Unisys, Hewlett-Packard, Silicon Graphics, and Sun M





dusrry analyst Сагу Smaby of the Sma- 
rint parallel 


r> hve veas." 
Eventually, parallel co 
mue ГО Bue 





mpuring will 
k SOT QE- 


| Бапа ита ace usine nemoka of work- 
| STRUDTU to simulate one giant parallel 


speech recap. 
| Corp. 


= 


machine, running big
or crunching science equations when they are idle at night. And single PCs with demanding applications, such as
ш Баки 


1 single . Thar holds out the pome 
ше of an ШИЛ ЦЕ technological оху- 


weird, bur И could give compurerns the 


tion ehurning. 
By игш ойн im Saw Frango 


| sensitive 

ocessnE in the nex three | 
| called mapnatetesis- | 
| tance. (MR) to pack 
| dam mare dense on 
| the disk. WR—so fur 


darabase programs - 
| of storage capacus | 
| mprovements see | 
тау need parallel power to keep up | 


r muinple microprooessors on : 


ELIN G 











Storage 


Can conventional disk drives—the de- 





formation Superhighway? After all, a sin-
to store, and the typical
will be an extra boost in bytes from ultra-
heads that use an electrical
made only by IBM,
has doubled the pace


1997. Now most drive
makers are experimenting with "giant
MR," which could boost storage density thirtyfold by the year 2009.
Mechanics and materials are improving, too. More precise positioning mechanisms will allow heads to fly closer to the disk surface to read denser data. Drivemakers such as Seagate Technology Inc. are trying out ceramic and glass disks, which are harder and flatter

moron: a single multiprocessor. Sounds
power to keep the Information Revolu-


moc ean fit in one drive disk rack 





TECHN O L O G Y 


Storing multimedia images, as well as spreadsheets and e-mail, presents special challenges. As drives warm up, they interrupt data flow briefly to adjust recording-head positions. Nobody notices such delays with conventional data. But interrupting continuous streams of video means blurry pictures. So Quantum Corp. and Hewlett-Packard Co. have


| по беја. 


Diskmakers have other tricks, too. Faster controller chips and a new digital formatting scheme will team up to move
may come from multiple drives in one
vices are popular in data-processing shops and can economically store the data needed for instant video-on-demand.


Back on your desktop computer, big improvements in CD-ROM drives are in the works. First, Japanese component makers are working on lighter laser
; ‘Woes that have been keeping daia on | 
= masnenc plamen for the рих Д) vesn.— 
e up with: che demands of the In- - 


heads that can be moved more quickly across a disk. An even bigger boost will come from spinning disks
enable a CD-ROM to handle up to six times the 150 kilobytes a second the ba-
sers using narrow blue beams to read denser data than red lasers can handle


(page Sl). 


to prove char cir technology 
of life left," says IBM Vice President Christopher Bajorek.
"We're probably decades away from any fundamental obstacles that would inhibit the progress of these technologies."
Indeed, even in the age of video-on-demand, disks will remain the workhorses of the Information Revolution. And other forms of storage won't go away, either. When a customer orders up a movie, it will be copied from disk arrays onto semiconductors. When the movie isn't likely to be viewed often, it will be erased from the disk and transferred to tape. Predicts Dataquest Inc. analyst J. Philip Devin:
whole storage industry is going to
sic CD churns out now. Also on tap: la-
thrive." More important, so should its customers.
By Robert D. Hof in San Francisco, with Neil Gross in Tokyo
Wer! Gron їп То 
Object Programming


When "object technology" emerged in the 1980s, experts hailed it as a way
departments caught up on their backlogs. That's a pretty daunting order for a


| ei Јер Сыргага а Now, exper аге 


something more: They're figuring that object software can play a pivotal role in making incredibly complex information networks manageable and
competitor. "Object programming is an enabling
и bbe 












The idea behind object technology is to break computer programs up into
them from other programs or purchase them, rather than reinventing the wheel every time. To add functions, they can simply add new objects.
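The building-block idea, and the article's own illustration of a "customer X" object answering a "marketing survey" object, might look something like the sketch below. The class and method names are hypothetical and chosen only for illustration; the article names no particular language or product design.

```python
class Customer:
    """An object that bundles a customer's data with code that acts on it."""

    def __init__(self, name: str, purchases: list[float]) -> None:
        self.name = name
        self.purchases = purchases

    def total_spent(self) -> float:
        return sum(self.purchases)

    def answer_survey(self) -> dict:
        """Respond to a request from another object with a summary."""
        return {"name": self.name, "total_spent": self.total_spent()}


class MarketingSurvey:
    """A second object that asks customer objects for their data."""

    def __init__(self) -> None:
        self.responses: list[dict] = []

    def ask(self, customer: Customer) -> None:
        # The survey never reads the purchase list directly; it only
        # calls the behavior the customer object chooses to expose.
        self.responses.append(customer.answer_survey())


if __name__ == "__main__":
    survey = MarketingSurvey()
    survey.ask(Customer("customer X", [10.0, 5.0]))
    print(survey.responses)
```

Adding a new function then means adding a new class, such as another kind of survey, without rewriting `Customer`.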


A huge potential payoff from object technology stems from binding program-





66 BUSINESS WEEK/THE INFORMATION REVOLUTION 1994


- information an 


: the gross national product. of 
to make programmers more productive,
speeding new applications to market
and finally getting corporate computer






THE OBJECTIVES OF OBJECT PROGRAMS 
Deme by ing pret Мо 3 
be Mix and mazch objects to cusemize — 





ming and data, rather than separating them as in conventional programming. So an object called "customer X" would include not only all data about the customer but also some computer code for
way it can respond when an object


to find a suitable material.
As information networks spread and more and more special-purpose objects are created, all the computers on the network gain new powers, powers that will make it far simpler for people to find
do business electronically. A searcher object asked to find


Pero, would relentlessly 
SOO rhe пески} tinal 
и сите toan object char | 
fame technique might | 
mike it æy for a chief 
executive w ейп impor- 
саги facts abour s client pr 





software char 






Compurer Inc. Be- 





ever steer maker 





Software makers are beginning their 


crosoft and Taligent, a joint venture of
IBM, Apple Computer, and Hewlett-


xl sometime 


пу obyect-oriented software.” savs Mike 


| tasks Each functon—wining 


amount of informn- | 
thon Gut there." says | 
Rick Jackson, direc- | 
tor ol producr mar- | 


keting at №ехт | 


have to work out a standard way of managing all of the objects on a network.
neat packages, called objects, which then serve as building blocks for larger programs. With programs made of objects, programmers are free to borrow





Potel, vice-president for technical development at Taligent.


Why? Not only can the new technology speed you to conventional computer information (documents, reports, databases, and so on) but it is far more practical for managing the pictures, video, and sound of multimedia, all of which can become objects. Another


наны peered реи донй: 
nate the need to shift from one program
called "marketing survey" asks for data on customers. And these objects can also
retardant plastic might be able to send an object around a network to talk to other objects at research labs in California


to another when you want to switch
or calculating, say, would be an object, like a hammer or a screwdriver, that you would pick up as needed. That will ease the
information among them.
That in itself would be a revolution:
computers that work the way we do, instead of requiring us to learn their arcane ways.
By Fred Guterl in New York


Agents & Artificial Life 


What is life? Scientists and philosophers through the ages have tried to answer that question by listing what seems to be unique to living organisms. Living things ingest food, for example, grow, reproduce, and finally die.

fore object software


Now consider a computer. Can you create a program that embodies all of the definitions of "alive"? Researchers in the field of artificial life are doing just that. Their efforts could lay the groundwork for a new era of amazingly useful information systems. Across the world,

more to object programming. Both Mi-
programmers are creating organisms from the primordial soup of digital bits that pulse through silicon chips. They float invisibly across the seas of computer networks, feeding on data, mating (passing on the characteristics of both parents to their offspring), growing, learning, evolving, and even dying

Packard, are developing object-based operating systems,
next year. Rather than slowly adopting object-programming techniques, the industry "should make a complete change

when their day has passed.
Already, the techniques being learned by A-lifers are changing the way programmers build software. Rather than writing programs, A-life researchers unleash "genetic algorithms," strings of computer code that automatically generate new code and can combine like

programs from artificial life, too. Rodney Brooks, a professor at MIT, is pr-
an ant. So they wander freely, feeling their way around obstacles or, if bl-


Conventional robots are getting new








pan are working on programs that can
around the globe
perform their tasks
on idle computers
after business


are also exploring 







SOFTWARE “AGENTS" may be able to act 
SOFTWARE CODE that automatically evolves






of 


COMPLEX COMPUTER SIMULATIONS 
ROBOTS programmed to mimic the simple
reasoning of insects may "learn" to find their way






Simulation безет, 
m which agents 
läm со interet 
Just в an шу cobo- 






















genes in an organism. They may evolve
through random "mutations" induced
spring are killed off.
Researchers believe such programs are the most efficient way to solve problems with a huge number of variables. "If you can run evolution loose on these

havior than an individual ant, Langton hopes software swarms will take on more complex behaviors.
Can such programs be considered "alive"? The debate rages. One definition of alive, offered at an A-life conference: "that which dies when you
on it." Alive or not, A-life could soon become a tool in our lives.


| lema, you might find solutions You : 
sth .. n n 








Bt] II 
tute of Technology, researchers are building genetic algorithms that may learn to schedule hundreds of processes in a factory. Supercomputer maker Thinking Machines Corp. is testing a genetic algorithm program called
to sift thousands of pieces of data on millions of credit-card users to predict how the cards will be used. Several researchers are running A-life programs to predict swings in the stock market.
software "agents," handy little programs that someday will sit in your computer and assist with tasks such as scheduling meetings or screening e-mail. Pattie Maes, a researcher at Massachusetts Institute of Technology's Media Lab, is
working on agano called “ауына” nr ` 
nobom." An e-mail sošrbor might spor 
parten in the way wx screen vour e- : 
maid and encode the routine in soft- 
Gn top. 





Speech Recognition 


Wouldn't it be great to deal with a computer on your own terms, say, by talking to it? That has been the dream of computer scientists for decades.

Continuous, speaker-independent speech recognition, where you could walk up to any computer and have it do your bidding, just like on Star Trek, still







TECHNOLOG Y 


E remains elusrve. Today, software char 


gramming robots with the instincts of
understand your "yes"
retracing their steps. Researchers in Ja-
program your VCR. But programs
travel the Internet, following the sunset


whole populations ` 
programs. Lang- | 
ton has a Swarm : 
for every speaker. And imaginative software tricks, such as using context to predict the next word in a given
nv seems ro или | 
Mave позе ел be- : 
| Savi spe 


help computers recognize any person's voice (speaker independence) is limited in vocabulary. Computers can
you're asked if you wish to accept
charges for a collect call. They can
with big vocabularies, such as Dragon Systems Inc.'s 60,000-word dictation package, must be "trained" for each user and
word.


cexr Wore has made an immature tech- 
mology mature for some applications," 
h researcher Kai-Fu Lee, Apple Computer Inc.'s director of interactive media.


By Rickard Brome im San Francis | ionic 
| zs whens speakers voice rises to signal 





enable computers из dissem inflection. 


thie comp for quotes on mu- 
ping in codes on a Touch-Tone 
phone. 
When will speech recognition
current rate of progress, researchers say it will be another decade before speech recognition replaces the
for most uses. Until then, it's unlikely that the Information Revolution will reach all citizens. "The only way that's going to happen is for computers to learn to understand what peo-


штит. : 


ple say," says George R. Doddington,
official who distributes
money for speech research.
some
Who knows? A few kind words, and computers might really get personal.


Ne ee Á 
ЧЕ 


Бу ery МЕ Алти im Басры 

















Wireless


The promises of the wireless communications revolution are vast but can be summed up easily: high-quality voice and data service anytime, anywhere. Sounds good. But if you've tried calling at peak hours in Los Angeles or New York, you know today's cellular systems





networks as capacious and reliable as fiber-optic lines.
Within three or four years, wireless data
wireless data speeds, through the use of improved software and chip technology,
digital technology that allows better compression.


Helping to prod the $10.9 billion cellular business into the digital age is an ambitious throng of newcomers that hope to wrest big chunks of the wire-
gams as Meles Cellular Communica : 
upstart networks (personal communications services, or PCS; enhanced specialized mobile radios, or ESMRs; and satellite-based setups) are pushing wireless technology to new limits. Says Sprint Cellular Vice-President Keith Paglusch: "The basic question is how much of the wireless market each will garner."


















тыны wee 











m 
Г 


But most cellular companies are looking for a bigger punch by using Code Division Multiple Access (CDMA). It promises a 10- to 15-fold capacity improvement using a technology called "spread spectrum" that distributes a digitized message across a wide range of frequencies. On the receiving end, the
reassembles the signal. Irwin M. Jacobs, president of CDMA developer
; be widely available in 193 


will 


are beefing up the ability of their networks to handle data so that customers with wireless modems in notebook PCs
McCaw has begun rolling out a system developed by IBM called cellular digital packet data (CDPD), which transmits
of data through
radio channels or during gaps in conversation. But CDPD works best for short files that can be quickly
By the time cellular operators have 


| COMA systems in place, they are likely | 


upa. These networks, using frequencies 
die digital voice, data, 
ind paging—in z sin- 
Бе receiver. Spear- 
such as Nextel Com- 
necwurks Ife now op 


стил in only a few 
chica, however they 


Gorwide by 1996 
Bur che biggest 


order is hlelv to be 


ellite venture being funded by Меса 


caded by startups | 


arc expected tD po na- | 


threat to che eeliular | 


| PCS. А variation on cellular technology, 
prc | PCS divides an ares into many “mine | 
pv NEA. | celis" Thar means that handses m send | 


| angie PCS bandser will be your home 
The ultimate wireless advance will be truly unlimited communications. Mo-
customers willing to pay a stiff price the ability to call any point on the planet. But by the time it's operating in 1995, it may amount to nothing more than "pie in the sky," says Ira Brodsky, president of Datacomm Research in Wilmette, Ill. With all the advances in land-based wireless, the only place the "anytime, anywhere" promise won't easily be fulfilled may be the Gobi Desert.
By Kevin Kelly in Chicago


ne en ш> 









ATM Switches


O o 


| | 
We're all being conditioned to expect immediate gratification in the digital age: with a click of a button we'll summon
stock quotes,
shopping services, or simply dial a phone call. But we may all be in for a wait, at least until a new switching technology, known as Asynchronous Transfer Mode (ATM), comes into widespread use. When it does, ATM's ability to funnel billions of bits to where they're needed "will make the network to end all networks," says analyst Paul D. Callahan of Forrester Research Inc.

ATM's trick? It divides the information into tidy packages, or cells, of 53 bytes each. The cells are coded, or stamped with an address, and then zipped over the network at high speed. A switch at the other end reads the coding and reassembles the phone call or e-mail message at the other end. The


speed is amazing. AT&T says its ATM
switch can pump 20
of data, or 1,500 copies of
every
second.
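The cell mechanism described above (cut the data into 53-byte cells, stamp each with an address, reassemble at the far end) can be sketched as follows. A real ATM cell does consist of a 5-byte header and a 48-byte payload, but the header layout used here (channel number, sequence number, payload length) is invented purely for illustration and is not the actual ATM header format; the function names are likewise hypothetical.

```python
CELL_SIZE = 53                          # bytes per cell, as the article states
HEADER_SIZE = 5                         # a real ATM cell has a 5-byte header
PAYLOAD_SIZE = CELL_SIZE - HEADER_SIZE  # leaving 48 bytes of data per cell


def segment(message: bytes, channel: int) -> list[bytes]:
    """Split a message into fixed-size 53-byte cells.

    Illustrative header: 2-byte channel, 2-byte sequence number,
    1-byte payload length. This is NOT the real ATM header layout.
    """
    cells = []
    for seq, start in enumerate(range(0, len(message), PAYLOAD_SIZE)):
        chunk = message[start:start + PAYLOAD_SIZE]
        header = (channel.to_bytes(2, "big")
                  + seq.to_bytes(2, "big")
                  + len(chunk).to_bytes(1, "big"))
        # Pad the last chunk so every cell is exactly 53 bytes long.
        cells.append(header + chunk.ljust(PAYLOAD_SIZE, b"\x00"))
    return cells


def reassemble(cells: list[bytes]) -> bytes:
    """Rebuild the original message, even if cells arrive out of order."""
    ordered = sorted(cells, key=lambda c: int.from_bytes(c[2:4], "big"))
    # Byte 4 of the illustrative header holds the true payload length.
    return b"".join(c[HEADER_SIZE:HEADER_SIZE + c[4]] for c in ordered)


if __name__ == "__main__":
    msg = b"hello " * 40
    cells = segment(msg, channel=7)
    print(len(cells), "cells of", len(cells[0]), "bytes each")
    print(reassemble(cells) == msg)
```

Fixed-size cells are what make fast switching practical: the switch always knows where the next header begins, so it can forward each cell without inspecting the payload.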
Impressive. But relatively few corporations or telephone
ATM. The holdups: The lack of fully standardized equipment and software, so that all ATM brands can work together, and steep prices. Which means companies such as U S West Inc., a Baby Bell, are holding back. ATM would

probably help U S West's interactive-TV trial. But Russ Skinner, director of video engineering for U S West Communications, doesn't believe that's the case: "Today, ATM is not mature enough or cost-effective enough to put in the network."

an interactive-TV pilot in Orlando using


јез have | 


Compression 





Despite advances in semiconductors, disk storage, and optoelectronics, there may always be a struggle to accommo-


That could soon change. A 530-member industry consortium, called the ATM Forum, is hammering out standards. The two-year-old group is now tackling two critical issues: The first is a standard for emulating local-area networks
ATM network.
control: how to control data flow when enormous numbers of people attempt to call up the same material at the same time.


ATM Forum President Fred Sammartino predicts that a standard for the local-area-network emulation is just around the corner, while one for congestion control should be completed by early next year.

MPEG2, to compress video. They work by stripping out redundant
mountain in the back-

At the same time, ATM switch prices are falling. Today, switches for high-speed corporate systems and large networks can cost up to $3 million, while switches for small networks cost $1,500 per port, or user. Prices are still steep, but they've been cut in half each year since 1991 and are expected to keep falling as ATM becomes more broadly adopted.

In a few years, prices will likely fall to a "few hundred dollars per user," says James Chiddix, senior vice-president of Time Warner Cable, which is launching


ATM switching technology. Chiddix
adds “Today, ATM is miter. But that's | 
about to change in an explosive way."
When it does, the digital deluge can
ка ЕЕ эы н UE. 


I the. 
| néeds w deal with the 


date all the electronic traffic and find a place to park digital information when it's not in use. And that will keep the pressure on to
of digital data into


pression have already yielded impressive results. The Joint Photographic Experts Group's standard compresses still
has two standards, MPEG1 and
information, say, the
computer
information that changes, MPEG reduces the full video signal from 250 million bits per second to 1.5 million for VCR quality. Now, using MPEG and ultrasensitive electronic circuits, researchers are able to transmit four channels of TV over ordinary copper phone wires, an impossibility a few years back.
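A quick calculation shows what those bit rates imply. The sketch assumes the figures read from the scan are 250 million and 1.5 million bits per second; the function name `compression_summary` is hypothetical.

```python
def compression_summary(raw_bps: float, compressed_bps: float) -> tuple[float, float]:
    """Return (compression ratio, fraction of the original bits removed)."""
    ratio = raw_bps / compressed_bps
    discarded = 1.0 - compressed_bps / raw_bps
    return ratio, discarded


if __name__ == "__main__":
    ratio, discarded = compression_summary(250e6, 1.5e6)
    print(f"ratio about {ratio:.0f}:1, {discarded:.1%} of the bits removed")
```

With those inputs the ratio is roughly 167 to 1, meaning more than 99% of the original bits are removed, which lines up with the "99% of the video signal" remark quoted in the article.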

But experts say that progress in compression may be slowing. "We've already thrown away 99% of the video signal," says John Forrest, chief executive of National Transcommunications Ltd., a sat-


By Кау Кейі im Son Frane | 


ellite-equipment manufacturer in Winchester, England. "Now, we have


р 


of chaos theory. Fractals work well on images of landscapes and seascapes made up of recurring patterns, but for
fer. And like MPEG, fractal images are costly to encode. "On balance, fractals


; wavelet algorithms ате a5 quick encod 
PNE M MEER 
| ким rely on їшїп motion 
images into one-fourth the space. The Moving Picture Experts Group (MPEG)
between


THE BIG SQUEET 





an MPEG signal is quick and cheap, encoding one is time-consuming and expensive.
There are some promising developments that could pay off
Engineers have achieved impressive compression ratios using
based on fractals, a branch


Quality tends to stif- 


don't appear to work any better than MPEG," says Jules A. Bellisio, executive director for video signal processing research at Bell Communications Research





"Red Rank NI 


Wavelet theory is a more enticing alternative. Wavelet algorithms are very efficient at dividing the video image into blocks and then describing each block with relatively concise mathematical equations. As a result,

Advances in the mathematics of com-





frames as MPEG does, wavelet
images tend to retain higher quality.


can have lines on the screen that reveal the image's blocklike structure, even in video that's termed "broadcast quality." The lines disappear only at compression ratios
as low as MPEG.
Another solution may lie in combining compression with more sophisticated display technologies. The human eye can
visual information to the brain even though the retina is a poor data pathway. Researchers suspect a video display with a web of thousands of tiny microcomputers, one for each picture element, or dot, of the video
be able to generate good video images with far less data than conventional videos now require. Such a display might emerge in 2 or 10 years. If not, we could stall in traffic on the Information Superhighway.

By Fred Guterl in New York







, still, wavelet images | 
БҮТ IEA


БАСЕВ 


QUALIER 


Information power is getting cheaper, and not only big business will benefit


BUSINESS 





When a truck rolls into the maintenance bay at Ryder
Brunswick (N.J.) facility, all Karen Reinecke has to do is push a button to learn instantly what's ailing the vehicle. Reinecke, a technician for the $4.2 bil-


the probe on the end of her handheld computer to a tiny coin-shaped dial on the truck's cab that has been gathering information on engine performance and


= no: ficu ғыр | | 
Tet CORAM Puan Иш: eiectoni sen- 






OVERVIEW 


НО WITH 
A VIEW: 

ту | ES 
wIETWOHR 
ЄС ЕН АТТА 
LEXTER 





sors under the hood. Gone is the guesswork (wrong 50% of the time) in finding engine problems. And with the sources of trouble more quickly identified, a truck's downtime can often be


: | | - Cur in halt 
lior; trarisportanoe pint, simpy muche 


Informanon Age mest Road Warripr 
Launched in mid-March, Ryders Fast 
lmek Maintenance Service will capeure 
every bir—and buyte-— of informaocon on 
its trucks етос And thanks to 


al] che new dace, scheduling the com- | 


56 








... will be simpler, inventory tracking and parts ordering more
efficient, and reports to fleet customers far more detailed. Better yet,
Ryder will be able to use information it collects on engine ... longer
warranties ...

... by the Information Revolution. As businesses depend more on
collecting, analyzing, and sharing information across their operations,
they're demanding worker skills for the Digital Age (page 110). A recent
survey by compensation consultants N. E. Fried & Associates ...

... need it will be disadvantaged," says Robert M. Howe, a former Booz
Allen & Hamilton Inc. consultant, who since 1991 has run IBM's fledgling
consulting business.

Luckily, a new generation of cheap computers and advances in software
and ...
Ш> aa NBRP SRP PASE 69 US compan used be- | fon it could and 
at most companies cut information it couldn't to find 
Klinger figures che $35 mik , | 
lion investment in new 
cutmp Uter sescems will pay 


















base of nor сак how m fix Ium dli 

the truck, but of failures, 

wa Cat жебеси | Tia aame of Sigil tochachay i Мара бнын 7 
panest act best in whar | impact оп businesses, their workers, and the supplier: 
applicztiom— whether irs ми contratos vno vnde vidc du, Fiat ew. 





AB 3 TTEE ZPO ПП 1% 
fut one мпп! part of a qui- | 
rt revolution i in РЕ way 


management layer and cut employment 














find easier and more effi- 
cient wave for thew em 



















levels. Meanwhile. companies are using less | IN eunt and 

plovecs, cusromeri, and costly computers and communication зе COS of cr 
ss Sasi to do business. devices to create "viral offices" from er is dropping 
© reason: Сатретитте workers in far-Hune locations | "E eu Is 
pressure is pushing ampe- ere. MM 5 КЕЗЕ PEM TCR 
nies to downsize even as The information “feedback loon” in i ч 
they improve both che collapsing development cycles. Compaction | nerworks of 
goods they make and the are electronically feeding customer and mar. | PCs or fest workstations 
E erir gi кек = keting comments to мез Меле о“ tind their processing costs 

ГЫ i ' ' 
lest, capone ныз sock teams so that chey can rejuvenate product actually jower than chose 


of competitors who took 
the plunge in the сапу 
| 1980s, when costly main- 
frames ruled che earth, In 
fact. USAN says ir will get 


lines and target specific consumers 

No longer simply an "order entry" job,
customer-service representatives are tapping
into companywide databases to solve callers'
demands instantly, from simple changes of
address to billing adjustments.


IBM and AT&T are scurrying
to reengineer their
businesses by rethinking
work flows and encouraging
information sharing
among once-autonomous


hicfdoms such as purchas- 











| a leg up on: 
| and may spend half the 


amount rivals Amencan 
ing, munufscruring, and | Airlines and (nied Aii- | 


ME E But its not pur | - lines pimped | 
өй hie smaller companies and ; sic UD elite. And factory | йш computer ee 
plowed are discovering thac | workers ac world-class manufacturers | sore [Us using hundreds of workstations 

xard Gia drops and performance id- | such as Motorola Inc. must have the | j t5 tear through critical ticketing infor- 
vances in PCs, wireless communications, : math and basic computer skills to run | manon 


ovemiuhri—whach some compet- 
and business software have given them : computer-proerammed machinery ar use : iron take а couple of days w crunch. 
cheap, potent weapons to compete | statistical process controls to monitor | cing information aster and more 
gamely against big, deep-pockered rj- : quality on the production line. "The | accurately can dramatically alter the 


vals (page 108). : companies that do noc provide informa- ; rules. Look at how the relanoaship be- | 
== =a 0, are being | buffeted = don on sn accessible basis t thome tha | rwreen supplies and mnaman has 
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changed for remie In rhe 1970s, LS 


mecima samed ot replace clunky coh | 


iogan with elexuronie point-of-sale евр. 


ed the wealth of information they have
on their customers, everything from
how often they use credit cards to what
color socks sell best on Friday night.


"eoi, inated of taking product deliv- | 


enes when vendors диге, | 


iS 
such zs Wal-Marr Soores Inc. i | 


Fe ате telling manufacrurers what they - 


| want апо exactly when and where they 


tapping this informacion to fine-tune 


te có wir consumen 
sre buvmg "Poinr- 
of-sale [technology] 
Changed the balance of 
and consumer goods 
manufacturer." sav 
Стегаја Готи, former 
CIO of Prudential 
Insumnce Co. and now 
bead of a consulting ser. 


vice for Compucer Sei- | Е 


ences Corp.'s Index | M 
onic. Adds Шише & | 
T 


touche Managing Di- 
rector William Atkins 
‘TE you're the vendor, 


кетчї a ыш Шш Gs 
i 


| i г | " AL D i 

Fg ee A д. ор Чы ы аа 4 LR 
ira d ЈНА T a + LEN oa KAY u 

Рај па ЕР 5 Wl E ые = л 





vou're st the [wrong] end af the chain. 
Retailing is by по means the only tn 


| dusry in the midst of this info-terh rev- 
mina: that made mbulatine receipts : 
easier. Buz it wasn't until the [9894 | 
when scanners and bar code became | 
de ецеш thar cecuilers fully appreciar- : 


repond m Customer bids ог Ryder dig- 
tized garage, companies in almost 
every industry are keen co gain comper- 





| Whar they want, and the more you're 
: able to deliver product: that the cus- | from 
tomem want to buy,” says John W. - 
: Harper, chief financial officer for USAir | 
j Inc. “In some cases, you con even cre- ` 
= аге demand if you have the right infor- · 
; muon" Thar way USAL cam "rniem- 
need и. Meanwhile, savvy suppliers are | | 


and schedule tn fill irs Friday afternoon 





nba 


mine 


Scuences survey of infar- 


an corporanons found 


the ме. 1 Focus of their 
companies тте» 


IO ELS EE) vPELITHE iso OY PCR Ра 


Klinger кау the new technolog his | 


: comparny {з using is che only way it can 
: | ; boos из customer satisfaction mong to 
olution. Whether it’s IBM giving sales : 9555, from rhe current 8855, by the end 
people once closely guarded dara on : 


product costs so thar they сап quickly | 


ol 1995 ` 
Edward L. Schenk. seni vice-pres- 


: idene at United Services Automobile 


Assn. (LISAA) in San Antonin, goes fur. 


| ther. He aedis much of the insurince 
| Hive advantage bv employing digital : 
. technologies co manipulate information. 
: "| he more you know about your core 
men, the more you're able co predict | 


m ics commirment to technology In їшї 


г 15 years, USAA has mushroomed to the 


nanon's hfth-Drgesr private auto іле. 
er and fourth-bigerst 


cm covermpe, with 
SIE 5 billión in esses. 

Since 1969, USAA has spent $130 million on computer and imaging
technologies to boost customer service and lower costs. Today, it boasts
an information system so advanced that it can track minute details, such
as which auto parts fail most often. Why bother? USAA passes that data on
to parts suppliers to see if there is a chance for improvement or if they
can make them more cheaply. The Big Three also get data from USAA to
improve their parts.


kam 


== = = O —————————s 
— rr cr rr r -ÑD..—-......................................- 


Likewise, USAA had been trying for a
long time to get glass shops to repair
windows that had punctures outside the
driver's field of vision, but no cracks.
But shops would rather pocket the $275
- unsivze" data on the 160,000 people it | 
- cames each day to find the best fare | 
ther own production schedules acoard- | 


to replace an entire windshield than
charge $35 to repair it. So even though


|| САА offered to waive the deductible if 
Пе} from Puzybureh «о : 
Harrisburg | OWEN were convincing drivers to re- 
ра- | place the whole thing. Only when USAA 
nies most абери at éx- : 
plaiting technologw го | 
are secking | 
something basic to | 
please cussomen In: 
fact. a recent Computer ; 


started capturing data and publishing
the repair record of various shops in its
newsletter did this start to abate. The
shops realized where they stood relative
to the competition and didn't want to
lose USAA's referrals. The percentage of


[nur yea 
To be sure, not all companies have 


| been as successful in making huge in- 
that customer service m : 


: repairs zoomsed to 2855 from 5% in just | 
TALON Systern Manag | 
ets ax U. 5. and Europe- | 


vestments in technology рат off. The | 


: problem. consultam за has been rhe 
| traditional view of песто" РЕКА 
tn technology. Rvder | 


simpli по eut ensis and support already- 


m 


7 
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existing Operations. [n the carly 1980s, : 


ститурагиез such аз General Moms Corp. 
were convinedd chat huge mvesumenr 


"In the past, the technology wis soon 
aa š perquam. рына for umplemen:- 
ing rhe erage,” says Joe (Carter, manag 


т: center in Palo Alm. Cal- 
i. "Today, technology is the зли 
Rv | 
tia systems now am | 
volver GEM dur on | 
custamécr "We have to 
реза ad Beeson Toe 


"Fora $7 bilon bebe 





"Now, 
comes fmm providing 
greater accen о the іпотеки." 

But и, companies have to know Í | 
their customers in even more detail i 
| Dell Computer Carp. CHO Thomas L. | 


power 


Thoms siya says new computer symems ће | 


ts inerslirng will make Deli de "Niel- 
sen of the compurer bursinëesg By us 








to be abie to anneipare their needs and : 


ШЕ new puma n, feck essanem de- | 


eall them wich just the produce they're : 


Е mus likely m buy—sry, more memory | 
г 0r new software. 

in TO m autre manufacturing : 
plna would boost productivity. Ir ций. | bua: 
other things, enable а regissered гори | 
| enon in amy coumtry о pull up dari on : h 
: any Dell PC—in his or her aurive | 
ing Фински of Andersen Сота ти | 


Dell is also investing heavily in a glo | 
ifam sys that will, among: 


tongue. That does awey with the need 


: to punt manuals m many diferent (ап- 


: guages. Over nme. Del] even plans to | 




















INFORMATION PRIORITIES HAVE 


CHANGED RA "Y 





| ега will be able co order additional PCs | 


| und receive shipment without ever mlk- 

Ing ma salesperson Lawr they will be | 

filed buying habits, Dell hopes one day | able so pull bet technical information to 
diagnose and fix problems on their own,
rather than tapping Dell's service desk.

Indeed, customer service is one of
technology investment's hottest areas.
Good customer-service systems can cut


4. ован cp ek аа ааа ume 


| dl they found rhe right department or 
: person tr send our a regairpezson or pen 


cess an order Thats because operare 
could do fede more than 
mke down basic informa- 
tion and pass the сег 
along. “H proved po be 
nor very g Customer 
Ex service but aiso very Er- 

ЈЕ | pensive rp w” says Lam 
ny Russell, a vicz-nresi- 
dent of СТЕУ telephone 


compared with one in | 
200 just over а усаг agp. 
"We'll be able во provide 
а service m thres 
dayand wai be ћи 
бо it with much lower 
| серіоуес comets," pre- 
| dicum 


i ei d syuterns га » help Hate and юса | 
Breemmenr reduce delay: az toll plazas. 
With Amtechy шетт, COMMITS Us š 
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build argunrzznons and systems: Сви can : 
М Te quickly to markerplace changes : 
= 


Ar Lexas [пштитпетиз Inc. the moro ta 
"Change faster than change,” says CIO 


| b "Bob" McLendon. TL which : 





HOW TECHNOLOGY 
TRANSFORMS WORK 


Innovation and technological change create winners and losers. Wal-Mart
rises and Sears falls. Microsoft triumphs and IBM slumps. The same is
true for labor: Some workers suffer job losses, while others get paid to
ride the high-tech revolution. Right now, the squeeze on jobs is most
obvious and worrisome. But over the past 200 years or so, there has been
no long-term trend toward higher unemployment because of investment in
new machines and technology.
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г: Spends more chan 445 of rerenuet об 


Е 1 ш Leader in the use of technoloe—irt has 
| | had а sophisnested e-mail system for 15 
; Wear, for example 

| avoid falling behind competitors is to } 
| 

| 

| 

| 


row, McLendon t wane that exper 
tse m build what he сај “the virtual 
factory,” a svsrem to let TI build any 


: ргобист апу time at апт of in factories 


worldwide—swith all the engineering 





Over the past 


Do daca ma 


Information 


lers sbilled and 


Services 


Min mg 


ПЁ} DIRE Ee, 


1 ин Шш. DTE 


г I. Already, TI s product designers can 
information technology, has long been | 


тшт Designs and equipment senp 
Шаттанат automated manufacnur- 
ing sites globally. For example, TI keeps 
its biz million-dollar chip пете in the 
Philippines, but they're canrrolled by 


| msrengpineen in Houston. While Ti gall 
| has a way tp go, McLendon boasts that 
| me semiconductor gmun reduced cycle 
| ume from order to delivery an astound- 
specs and invoices moving eleceronical- i 


But it has been а 
With mt leet seme 
college education. a= 


M anufactu rin E 


ing 39 lass year—laigely by linking its 
Service-sector productivity is picking up smartly. One
ПИКЕ UG COMPUTERS ON ШЕ Ж. П Le 





Finance, insurance, and real estate
Public administration
Transportation, communication, and other public utilities
Wholesale and retail trade
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COMDUDETS into one global network : 


Мырат) fram a Spáce-ape ommal cen 
n [qe da Dallas 1 1 B 1can ree ngineel Етік 


cesses сип the f—though the chips mov Ë 


be ordered in pan, dev Sloped {п Ter- 
ай. and mude їп the Philippines. 
Lf concen 1 


of, howeve4, compantes will “have to 
change their carporte caulnires in eddi- 


tion ta adding new digital technology. : 





Technological change and office automation will
shrink these jobs...

Computer operators
machine operators
Telephone operators
Typists and word processors
Bank tellers


Business investment is skyrocketing as the cost of labor 
70 | 
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ШТА КЖ Dr ша OTe 


: акад departments aru ther custo 


. Ышш informaron becomes mal- i ing, we would start to evaluate people 


bath dà the virtual ћеш : 
| ту or the vit се are poing to шаке 
= nalogy то allow employees in aha infer 
mation, but forger that sharing ideus i ; 


Billing, posting, and m 
Be your own boss. Computers linked to the Superhighway
could open up new opportunities for entrepreneurs.


Аз companies break down che bermen 


ћу l critica tD апт оса ап ъ застае" 


| ive Thomas H. Davenport Је. a con 


sutranr wich Emst & Young 
The problem, Davenpor: angues, is 
that many campanes spend Бед on ech- | 


an unmnzrunl act" in oponie culrures 





ite tech wortd. 


-39% 
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Computer engineers ана scientists 





UNEN т тт YW ә ч 
MIL dE És юш ШИН 


char rewaerd аги за | воће “Tr 
| we nex cared about informanian xhar- 


Dy how well they share," says Laven- 


: port. With. rhe amount of information 


that flies wound organizations moming 


: daily focusing on informanon 


instead of jet астын echnolegy— 


| muy be the rr revolution. 


Wess Райт feram m Dats ama! dorem 


rri 


...but technology also generates new openings in the
Physical therapists
Companies are doing with less. Classic white-collar
workers are being replaced by high-tech machines,
on government policy. Look at the effect of permitting
the Baby Bells into new business.
MISSION-CRITICAL VIEW
The Disappearing Programmer


AS OBJECT-ORIENTED CONCEPTS TAKE HOLD, THE NEED FOR SPECIALIZED
PROGRAMMERS IS DECLINING.

between now and the year 2010. Note that the vertical scale is
logarithmic. You can categorize programmers in four areas:

* IS department programmers. These programmers work ... Their numbers
  are in fairly steep decline, and will probably reduce to the hundred
  thousands by the year 2010, from about ten million.

* ... programmers. These programmers work ... applications. The experts
  expect their numbers to rise to ...

* ... program sporadically as part of their professional activities. In
  other words, this category encompasses the professional end user. The
  number of such individuals will likely rise to more than 100 million
  by 2010.





and a whole host of other products to increase productivity 


tools, databases, OÓ languages, automated testing sofrware, We assume that our group of assemibled experts is correct, the 
! of voltwa ( T i 
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lor major c 
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Robin Bloor iz chief executive of the compucer ressarch and con- TPS Crowth-rate predictions for fou: bur alt | 
sulting fim se trator Lil. Yau сви comtact ins firm ai 44-908 ШШ orasan? Motive Ghat the wean | 
У УП tm the ЇЇ) lagnzrithmic. 


Sapna these asserted expert were раа i a se 
| ot papers. Figure 1 shown one set of predictions | 
growth rate of four specific types of professional programmers 
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will li the gap, or a combination af both. | n" ire system builder's point of view, it is about tallorable and 
comes a reality. Малу commentators have observed over the Of course the problem af providing reusable software com- 
Element names are not case sensitive
Proposed features are marked thus†


Documents start with DOCTYPE followed by head and body enclosed in:


<HTML>... </HTML>

Head is enclosed in: <HEAD>... </HEAD> 

Body is enclosed in: <BODY>... </BODY> 

Comments are written: <!-- A Comment --> 
DOCTYPE, <HTML>, <HEAD> and <BODY> may be omitted 


Sample Document 


<!DOCTYPE HTML PUBLIC "-//W3O//DTD HTML//EN//2.0">
<HTML>
<HEAD> <!-- A Sample Document -->
<TITLE>Document Title</TITLE>
</HEAD>
<BODY>
<H1>First Header</H1>
<P>Paragraph one.
<DL>
<DT>Term<DD>Definition
</DL>
</BODY> 
</HTML> 


Head Elements 


<TITLE>...</TITLE>    title (length < 64 chars)
<ISINDEX>             document is searchable
<BASE HREF="url">     base URL of document
<LINK ...>            relationships to other objects
<META ...>†           embed meta-information, attributes:
                      HTTP-EQUIV, NAME, CONTENT
<NEXTID N=id>         next identifier to be generated

Headings 


Headings, level 1 to 6, specified by: 


<Hn>heading text</Hn> 


Spacing 

<P>text...</P> start new paragraph 
<BR> force line break 
<HR> horizontal rule 
Images (level 1)
External graphics are specified with links. Embedded images are specified with:


<IMG SRC="url" [ALT="description"] [ALIGN="..."] [ISMAP]>
ALIGN can be one of: top, middle or bottom
An image map is an embedded ISMAP image that is also an anchor.


Example


<A HREF="http:/cgi-bin/imgshow/beowulf/f196">
<IMG SRC="http:/beowulf/f196.gif" ISMAP
ALT="Folio 196 of the Beowulf manuscript"></A>


Forms (level 2)


Data input forms are enclosed within


<FORM [ACTION="..."] METHOD="..." ENCTYPE="...">...</FORM>
Fields defined by <TEXTAREA>, <INPUT> and <SELECT>


<TEXTAREA NAME="..." ROWS=r COLS=c>...</TEXTAREA>
    multi-line text field
<INPUT [...]>
    attributes: ALIGN, CHECKED, MAXLENGTH, NAME, SIZE, SRC, TYPE, VALUE


TYPE can be one of:
    checkbox, hidden, image, radio, reset, submit or text (default)


<SELECT NAME="..." [MULTIPLE]>...</SELECT>
    alternatives specified by <OPTION> tags


<OPTION VALUE="..." [SELECTED] [DISABLED†]>


Example


<HR> <FORM ACTION="http:/cgi-bin/script">
<P>Name: <INPUT NAME="name" TYPE="TEXT" SIZE=20>
<P>Operating System<SELECT NAME="os">
<OPTION> UNIX<OPTION>VMS<OPTION>MS-DOS</SELECT>
<P>Comment: <TEXTAREA NAME="rem" ROWS=3 COLS=60>
I think that ...</TEXTAREA>
</FORM><HR>


Special Characters 


&lt;    <   less than symbol
&gt;    >   greater than symbol
&amp;   &   ampersand
&quot;  "   double quote
&nbsp;      non-breaking space


&#198;  &AElig;   AE diphthong
&#193;  &Aacute;  A, acute
&#194;  &Acirc;   A, circumflex
&#192;  &Agrave;  A, grave
&#197;  &Aring;   A, ring
&#195;  &Atilde;  A, tilde
&#196;  &Auml;    A, diæresis/umlaut
&#199;  &Ccedil;  C, cedilla
&#208;  &ETH;     Eth (Icelandic)
&#201;  &Eacute;  E, acute
&#202;  &Ecirc;   E, circumflex
&#200;  &Egrave;  E, grave
&#203;  &Euml;    E, diæresis/umlaut
&#205;  &Iacute;  I, acute
&#206;  &Icirc;   I, circumflex
&#204;  &Igrave;  I, grave
&#207;  &Iuml;    I, diæresis/umlaut
&#209;  &Ntilde;  N, tilde
&#211;  &Oacute;  O, acute
&#212;  &Ocirc;   O, circumflex
&#210;  &Ograve;  O, grave
&#216;  &Oslash;  O, slash
&#213;  &Otilde;  O, tilde
&#214;  &Ouml;    O, diæresis/umlaut
&#222;  &THORN;   Thorn (Icelandic)
&#218;  &Uacute;  U, acute
&#219;  &Ucirc;   U, circumflex
&#217;  &Ugrave;  U, grave
&#220;  &Uuml;    U, diæresis/umlaut
&#221;  &Yacute;  Y, acute
&#223;  &szlig;   German sharp s
&#230;  &aelig;   ae diphthong
&#225;  &aacute;  a, acute
&#226;  &acirc;   a, circumflex
&#224;  &agrave;  a, grave
&#229;  &aring;   a, ring
&#227;  &atilde;  a, tilde
&#228;  &auml;    a, diæresis/umlaut
&#231;  &ccedil;  c, cedilla
&#240;  &eth;     eth (Icelandic)
&#233;  &eacute;  e, acute
&#234;  &ecirc;   e, circumflex
&#232;  &egrave;  e, grave
&#235;  &euml;    e, diæresis/umlaut
&#237;  &iacute;  i, acute
&#238;  &icirc;   i, circumflex
&#236;  &igrave;  i, grave
&#239;  &iuml;    i, diæresis/umlaut
&#241;  &ntilde;  n, tilde
&#243;  &oacute;  o, acute
&#244;  &ocirc;   o, circumflex
&#242;  &ograve;  o, grave
&#248;  &oslash;  o, slash
&#245;  &otilde;  o, tilde
&#246;  &ouml;    o, diæresis/umlaut
&#254;  &thorn;   thorn (Icelandic)
&#250;  &uacute;  u, acute
&#251;  &ucirc;   u, circumflex
&#249;  &ugrave;  u, grave
&#252;  &uuml;    u, diæresis/umlaut
&#253;  &yacute;  y, acute
&#255;  &yuml;    y, diæresis/umlaut
Lists


List items are preceded by <LI>, except in definition lists, which use <DT> and <DD>
pairs.

<OL>...</OL>            ordered list, items numbered consecutively
<UL>...</UL>            unordered list, items marked with bullets, etc.
<DIR>...</DIR>          directory list
<MENU>...</MENU>        list of short items
<DL [COMPACT]>...</DL>  definition list


Block Formatting Elements 


<ADDRESS>text...        address information
</ADDRESS>

<BLOCKQUOTE>text...     quoted text
</BLOCKQUOTE>

<PRE [WIDTH=n]>text...  preformatted text
</PRE>

Highlighting 

Logical Markup 

<CITE>...</CITE>      citation
<CODE>...</CODE>      code example
<DFN>...</DFN>        defining instance†
<EM>...</EM>          emphasis
<KBD>...</KBD>        keyboard input
<SAMP>...</SAMP>      literal characters
<STRIKE>...</STRIKE>  strike-out†
<STRONG>...</STRONG>  strong emphasis
<VAR>...</VAR>        variable name


Optical Markup


<B>...</B>            bold
<I>...</I>            italic
<TT>...</TT>          fixed-width
<U>...</U>            underlined†

Links


Anchors can be a link to another location:
<A HREF="url" ...>anchor text...</A>
or the destination for a link:
<A NAME="name" ...>anchor text...</A>


Other attributes: 


aca” s ом 1 = á = в 9 -. 
The Hypertext Markup Language


This chapter introduces HTML (Hypertext Markup Language), the system that 
the Web uses for marking up documents. Later chapters look at the more 
advanced features such as including images in documents and setting up
fill-out forms.


4.1 Overview of HTML 


Web documents are written using the Hypertext Markup Language (HTML). 
Originally there was no rigid definition of HTML and no mechanism for ex- 
tending the language. This led to the situation where different groups added 
features that would work with their browsers but not necessarily with other 
browsers. (A notable example is that of fill-out forms; these were added by the 
NCSA for the X Windows version of Mosaic, but were not supported by other 
browsers — not even by Mosaic on other platforms.) 


Recently there has been a move to standardize HTML. The original language
has retrospectively been designated version 1.0 and version 2.0 will define
current practice. Within version 2.0 there are three levels of features, 0, 1
and 2, which correspond to differing degrees of browser sophistication, level 0
being the simplest and level 2 the most complex. Version 2.0 is currently
described in a draft Internet Request For Comments (RFC), which is expected
to be finalized at the end of 1994.


HTML is a document type described in the Standard Generalized Markup
Language (SGML), a system for formalizing the structure of documents and
enabling documents to be interchanged between different document
processing packages. Starting with version 2.0 HTML is formally defined as an
SGML document type definition (DTD), which means that the definitive word on
what constitutes legal HTML is embodied in an SGML definition. SGML is
defined by an ISO standard (ISO 8879) and is described in The SGML
Handbook[10] by Charles F. Goldfarb. Practical SGML[15] by Eric van Herwijnen
is a good introduction to SGML.


Already there is discussion going on about version 3 of HTML, which will 
add new features to the language, such as tables, figures and mathematical 
formulae. This can probably be expected to appear some time in 1995.


4.2 Getting started quickly 


It is useful to know how Web documents are structured, even though there 
are editors available that will let you create Web documents without such 
knowledge. HTML documents consist of plain text interspersed with markup 
directives, called tags. Tags are instructions to the browser software on how 
to display the text, and are represented by strings enclosed in angle brackets,
for example <TITLE>. Figure 4.1 shows what the source for a simple HTML
document looks like and Figure 4.2 shows what this looks like using X Mosaic.


<TITLE>An English country garden</TITLE>

<H1>An English country garden</H1>

The garden at Hidcote Manor could be said to combine the maximum formality
of design with the minimum formality of planting. It is devised as an
interconnected series of outdoor rooms, enclosed by walls or hedges, each
with a distinct theme, and each affording a tantalizing glimpse of the
next, just sufficient to lead you on to explore further.

<P> In places the garden opens out to frame a far-reaching view of the
surrounding Cotswold hills. Elsewhere the atmosphere is intimate, as in
the cottage garden where four rather dumpy topiary birds, cut from box
plants, face each other in a cosy circle.


Figure 4.1 HTML source for a simple Web document with minimal markup
<HEAD>
<TITLE>Document Title</TITLE>
</HEAD>
<BODY>


replacing the literal string Document Title between the <TITLE> and
</TITLE> tags with your document's title.


• Put a </BODY> tag after the last line of the text.


• Find each heading in the text and put a start heading tag at the beginning
of the line and an end heading tag at the end of the line. There are six
levels of heading, from <H1> to <H6>, where <H1> is the highest level.


• Put a <P> tag at the start of each paragraph in the text.


This will leave you with a document with headings and broken into paragraphs. 
You may want to add other features, such as emphasis or links, described later 
in this chapter. 
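
The steps above can be sketched as a short program. This is only an illustration, not a tool described in this chapter: the function name mark_up and its headings argument are hypothetical, blank lines are assumed to separate paragraphs, and special characters such as < and & are not escaped.

```python
def mark_up(title, text, headings=None):
    """Wrap plain text in minimal HTML, following the manual steps above.

    `headings` is a hypothetical convenience: a dict mapping a heading
    line to its level (1 to 6). Finding the headings is left to the author.
    """
    levels = headings or {}
    lines = ["<HEAD>", f"<TITLE>{title}</TITLE>", "</HEAD>", "<BODY>"]
    # Assume blank lines separate the blocks (headings or paragraphs).
    for block in text.strip().split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block in levels:
            n = levels[block]
            # Headings get a start tag and an end tag.
            lines.append(f"<H{n}>{block}</H{n}>")
        else:
            # Paragraphs get a <P> tag at the start only.
            lines.append(f"<P>{block}")
    lines.append("</BODY>")
    return "\n".join(lines)
```

Calling mark_up with a title, the raw text, and a dictionary such as {"An English country garden": 1} returns the marked-up source as a single string, ready to be saved to a file.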


A fast way to learn HTML is to look at the source of existing Web documents, 
particularly those you consider well put together. Most browsers have an
option, View Source, which will pop up a window containing the raw HTML. 


4.3 Structure of documents 


The simple HTML document introduced in the previous section omits a number
of HTML tags. If the document was prepared with an authoring tool, the
missing tags would probably be supplied automatically and the source would 
look like Figure 4.3. 


The first line is a DOCTYPE directive and says that this document uses version 
2.0 HTML. If this line is omitted, version 2.0 HTML is assumed. The rest of the 
document is enclosed in an HTML container element (see below); again, this is 
assumed if it is omitted and is therefore not strictly necessary. Most existing 
browsers allow you to omit these lines. They are shown here for the sake of 
completeness. 


HTML (and SGML) regard a document as a logical hierarchy of elements. 
Hence, elements are the structural components of a document. Elements 
start with a tag identifying their type. An element can be a single entity, 
such as an included image or a special character. These do not require an 
end tag. Alternatively an element might be a chunk of data or text, which 
logically requires a terminating tag, in which case the element is referred to 
as a container. 
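
To make the idea of a hierarchy of elements concrete, the following sketch prints an indented outline of the elements in a document. It is an illustration under stated assumptions rather than a description of how any browser works: it uses the HTMLParser class from Python's standard library, the names OutlineParser and outline are hypothetical, the set of single-entity elements is a small illustrative subset, and every container is assumed to carry an explicit end tag.

```python
from html.parser import HTMLParser

# Elements that stand alone and take no end tag (an illustrative subset).
EMPTY_ELEMENTS = {"img", "br", "hr"}


class OutlineParser(HTMLParser):
    """Record each element, indented by its depth in the hierarchy."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        self.outline.append("  " * self.depth + tag)
        if tag not in EMPTY_ELEMENTS:
            self.depth += 1  # a container: what follows is nested inside it

    def handle_endtag(self, tag):
        if tag not in EMPTY_ELEMENTS and self.depth > 0:
            self.depth -= 1  # the container has been closed


def outline(source):
    parser = OutlineParser()
    parser.feed(source)
    parser.close()
    return parser.outline
```

Because the sketch does not infer omitted end tags, a document that leaves them out would appear more deeply nested than it really is.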


End tags may be omitted if the end tag can be implied by what follows. For 
example, </P> tags need not occur in a document since they are implied by
a <P> tag or in fact by any text; similarly the first <P> tag after a heading can 
<!DOCTYPE HTML PUBLIC '-//W3O//DTD WWW HTML 2.0//EN'>
<HTML>
<HEAD>
<TITLE>An English country garden</TITLE>
</HEAD>
<BODY>
<H1>An English country garden</H1>


<P> The garden at Hidcote Manor could be said to combine the
maximum formality of design with the minimum formality of
planting. It is devised as an interconnected series of outdoor
rooms, enclosed by walls or hedges, each with a distinct theme,
and each affording a tantalizing glimpse of the next, just
sufficient to lead you on to explore further.


<P> In places the garden opens out to frame a far-reaching view
of the surrounding Cotswold hills. Elsewhere the atmosphere is
intimate, as in the cottage garden where four rather dumpy
topiary birds, cut from box plants, face each other in a cosy
circle.
</BODY>
</HTML>


Figure 4.3 Complete markup for a simple document 


be implied by the presence of text. This means that existing documents will 
conform to HTML version 2.0 with regard to paragraph marks, even if, looking 
at the raw HTML, the paragraph tags seem to be strangely placed (often at 
the end of the last line of the preceding paragraph from the viewpoint of an 
HTML version 2.0 browser). Originally <P> tags were paragraph separators and 
terminated a paragraph. They are now being redefined to start paragraphs. 
This is being done so that attribute information for a paragraph, such as the 
type of justification, can be included in the leading paragraph tag. This is 
also how SGML does things, and efforts are under way to make HTML more
compatible with SGML.


Some start tags may contain attributes, which further define the characteristics
of the element. Most attributes are specific to each element type and are
described with their related elements below. Attributes usually consist of an
attribute name followed by an equals sign and a value. The value may be a
string literal enclosed in either single or double quotes or it may be a name
token. It is best always to put quote characters around attribute values in a
tag.
4.4 Naming schemes on the Web


The World Wide Web uses a universal naming scheme, the Uniform Resource
Identifier (URI), to identify and address documents and other resources on
the Net. This scheme is described in an Internet Request For Comments
(RFC 1630), and encompasses a number of schemes already in general use
and some which are still being developed. Two new schemes, the Uniform
Resource Name (URN) and the Uniform Resource Citation (URC), are under
discussion, which together will allow copies of resources to be distributed
across the Web and facilitate retrieval of the closest or cheapest copy. These
make use of the current scheme used on the Web, the Uniform Resource
Locator (URL), which expresses the address of a resource and the method by
which it can be accessed.


4.4.1 Syntax of URLs 


This section describes the syntax of URLs in detail. You may want to skip 
straight to Section 4.5 on an initial reading.


The general syntax of a URL is: 
scheme:path 


The scheme identifies the protocol, such as HTTP, Gopher, FTP, and so on,
that the browser should use to access the resource. The interpretation of
path depends on the protocol being used. For many protocols path is taken
to be a hierarchical name including a host name and optional port number.
Host names are preceded by a double slash (//). Case may or may not be
significant within the path, depending on the operating system on which the
server is running. Port numbers are numeric identifiers that specify which
server program on the server machine is addressed. These are standardized
for standard protocols: Gopher uses port 70, HTTP uses 80, and so on, and
where the standard port is used it need not be explicitly stated in the URL.
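
The decomposition described above can be tried out with Python's standard urllib.parse module. The following is a minimal sketch rather than a definitive implementation: the DEFAULT_PORTS table and the effective_port helper are assumptions introduced here for illustration (only the Gopher and HTTP defaults come from the text), and the URLs are examples used in this chapter.

```python
from urllib.parse import urlsplit

# Default ports named in the text; this table is an illustrative assumption,
# not a complete registry.
DEFAULT_PORTS = {"gopher": 70, "http": 80}


def effective_port(url):
    """Return the explicit port of a URL, or the scheme's standard port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


parts = urlsplit("http://wintermute.ncsa.uiuc.edu:8080/auth-tutorial/")
print(parts.scheme)    # protocol the browser should use
print(parts.hostname)  # host name that follows the double slash
print(parts.port)      # explicit port number
print(parts.path)      # hierarchical path on that host
```

When no port is written in the URL, effective_port falls back to the table, which mirrors the rule that a standard port need not be stated.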


The path may be followed by a query string or a fragment identifier. 


Query strings can be phrases used to locate indexed documents. They are also 
sometimes used to pass coordinate data from image maps and user input from 
forms to a server. They are indicated by a question mark (?) following the
path. Within a query string spaces may be replaced by plus signs (+), which 
means that real plus signs must be encoded. 


Fragment identifiers are indicated by a hash sign (#) followed by a name at 
the end of the URL. They are interpreted by the browser as the address of 
locations within a resource, and are not actually passed to the server. 
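
As a rough illustration of the two suffixes just described, the sketch below separates a query string and a fragment identifier using Python's urllib.parse. The URL is hypothetical, built on the placeholder host that appears in this chapter's examples.

```python
from urllib.parse import urldefrag, urlsplit, unquote_plus

# Hypothetical URL on the chapter's placeholder host.
url = "http://www.somesite.org/doc.html?english+garden#next.topic"

# The fragment is resolved by the browser; only the part before '#' is requested.
request_url, fragment = urldefrag(url)

# The query string follows the question mark; '+' stands for a space, so a
# literal plus sign has to travel as %2B.
query = urlsplit(url).query
print(request_url)
print(fragment)
print(unquote_plus(query))
print(unquote_plus("1%2B1"))
```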


Partial URLs can occur in documents. These are interpreted by the browser
as being relative to the URL of the current document, using rules similar to
those used to resolve filenames on the UNIX system. The names .. and . are
taken to mean the next level up and the current level respectively.
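
The resolution of partial URLs can be sketched with urljoin from Python's urllib.parse, which applies rules of this kind. The base URL below is hypothetical and uses the chapter's placeholder host.

```python
from urllib.parse import urljoin

# Hypothetical URL of the current document.
base = "http://www.somesite.org/garden/plants/roses.html"

same_dir = urljoin(base, "tulips.html")      # same directory as the current document
current = urljoin(base, "./tulips.html")     # '.' is the current level
parent = urljoin(base, "../index.html")      # '..' is the next level up
in_page = urljoin(base, "#next.topic")       # fragment within the current document

for resolved in (same_dir, current, parent, in_page):
    print(resolved)
```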


URL-encoding


There are a number of special characters that cannot be directly included in
a path part of a URL: +, <, >, ", /, ? and the space character. These
are encoded as a per cent symbol (%) followed by the hexadecimal value
of the character in the ISO-8859 character set. Spaces that represent word
boundaries are encoded as plus signs.
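
The encoding rule can be checked with the quote functions in Python's urllib.parse. This is only a sketch: the sample strings are invented for illustration, and the encoding="latin-1" argument is an assumption made here to match the ISO 8859 character set mentioned in the text (Python otherwise encodes as UTF-8).

```python
from urllib.parse import quote, quote_plus, unquote

# safe="" asks quote() to encode '/' as well; by default it leaves '/' alone.
less_than = quote("<", safe="")
question = quote("?", safe="")
slash = quote("/", safe="")
space = quote(" ", safe="")
print(less_than, question, slash, space)

# quote_plus() writes spaces as '+', so a real plus sign becomes %2B.
words = quote_plus("topiary birds")
sum_text = quote_plus("1+1")
print(words, sum_text)

# Non-ASCII characters: choose ISO 8859-1 explicitly.
accented = quote("\u00e9", encoding="latin-1")
print(accented, unquote(accented, encoding="latin-1"))
```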


4.4.2 URLs for different information systems 


This section describes the specific syntax of URLs for each of the
information system protocols. The type of a document is indicated separately
from the protocol. Documents of a particular type are not restricted to being
retrieved using a particular protocol.


HTTP 


The Hypertext Transfer Protocol is the native method of transferring documents
on the Web. It is the protocol most often used for accessing documents. An
HTTP URL has the syntax:


http://host[:port]/path


The standard port for HTTP servers is port 80; if the server is listening on this
port you do not have to specify the port number explicitly. Examples of HTTP
URLs are:

http://info.cern.ch/hypertext/WWW/Tools/Overview.html
http://wintermute.ncsa.uiuc.edu:8080/auth-tutorial/


Gopher


Gopher is a precursor to the Web that views information as a series of
menus, which may contain text and other format files. It is still in widespread
use. Gopher items can be specified as:


gopher://host[:port]/[type[item]]


The standard port for Gopher servers is 70 and if this is used it does not
have to be specified. Gopher uses the concept of a selector to represent the
type of a resource and its pathname. The type value is encoded as a single
character, the most common being 0 for text files and 1 for menus. The type
is explicitly included both as the first character of the pathname, or item, and
as a separate field, and thus occurs twice after the host name.


An example of a Gopher URL is:
gopher://gopher.micro.umn.edu/11/


This refers to the top-level Gopher menu at gopher.micro.umn.edu, and could
be abbreviated by dropping the trailing 11/.


File Transfer Protocol 


File Transfer Protocol (FTP) is one of the oldest mechanisms on the Internet 
for retrieving files from remote machines. Files and directory listings can be 
specified in URLs using the ftp scheme: 


ftp://[username[:password]@]host/path/file


By default the username is anonymous, the username for anonymous file trans- 
fer. Explicitly specifying a different username is not recommended if a pass- 
word is required for that account as it will have to be encoded in the URL, 
which may be stored in plain text in documents. 


An example of an FTP URL is: 
ftp://ftp.w3.org/pub/ls-lR.72 


File 

The file scheme allows files on your local system to be specified, in which 

case the browser will read the file directly. The syntax is: 
file:[//hostname]/path 


A host name can be specified so that browsers can determine that the file 
referred to is not on the local system. They may then use another scheme, 
such as anonymous FTP, to try to retrieve the file.


An example of a file URL is: 
file:/usr/local/lib/ghostscript/README


News 


The news scheme allows USENET news groups or articles to be specified.
It differs from the other schemes in that no host is specified in the URL;
your news host is usually specified directly to the browser by an environment
variable or some other means.


The syntax for newsgroups is:
news:newsgroup

and for individual articles:
news:article-id

Examples of actual news URLs are:
news:comp.infosystems.www.providers


news:bwh.2.0010809@access.digex.net


4.5 Basic HTML elements in detail 


This section describes the basic HTML elements. 


4.5.1 Comments 


Comments can appear anywhere in an HTML document, except within a tag,
and are enclosed between the strings <!-- and -->, like this:

<!-- This is a comment -->

Comments may not be nested: browsers will regard everything after the first
comment terminator as markup.


Although strictly speaking the end of comment is denoted by the string -->,
some browsers regard the single character >, without the preceding double
hyphen, as a comment terminator.


4.5.2 The DOCTYPE directive 


The DOCTYPE directive is an SGML construct that identifies the type of the 
document as being HTML. This should be: 


<!DOCTYPE HTML PUBLIC '-//W30//DTD W3 HTML 2.0//EN'>


If the DOCTYPE directive is missing, this information should be assumed by 
browsers compliant with HTML version 2.0 or later. 


4.5.3 The document head 


The head contains meta-information (information of a higher order) about the
document, such as the title. The head is identified by the HEAD element. This
can be omitted, but it is better to include it as it allows server software to find
out information about the document without having to search through the
whole document.

The following elements can occur in the head:




38 Spinning the Web 


TITLE     The title of the document.
ISINDEX   Indicates that the document is searchable.
BASE      Specifies the URL of the document.
LINK      Specifies relationships to other documents.
NEXTID    Indicates the next identifier to be generated (for use by
          authoring tools).
META      Specifies meta-information about the document.


None of the head elements are compulsory, although a TITLE element is
recommended.


The TITLE element probably needs no further explanation, but note that only
: characters may be included and that the length should not exceed 
4 characters including spaces. 


The ISINDEX element tells the browser that the document can have a query
string appended to its URL and the server will then invoke a script to perform
a search accordingly. This is described in more detail in Chapter 11.


The BASE element takes a single attribute, HREF, which gives the URL of the
document.


The LINK element describes the relationship of the document to other
documents.


The NEXTID element specifies the next anchor label to be automatically
generated within the document. This is used by HTML editing tools to keep track
of hypertext link labels (see Section 4.6). It has no meaning to browsers, and
you don't need to use it if you are writing a document by hand.


The META element was introduced in HTML version 2.0 as a 'catch-all' to allow
information that isn't covered by any of the other head elements to be
included. It takes three attributes: NAME, HTTP-EQUIV and CONTENT. The
information is named either by the NAME or the HTTP-EQUIV attribute. For example:


<META HTTP-EQUIV="Expires"
CONTENT="Tuesday, 19-Apr-94 18:47:05 GMT">


4.5.4 The document body 


The body of a document comprises the actual document contents to be
displayed. This includes headings, text and images.
Headings


You can have up to six levels of heading in your documents, marked from the
highest level, H1, to the lowest, H6; for example, the top-level heading:


<H1>An English Country Garden</H1>

may be followed at the next level by:


<H2>The Vegetable Plot</H2>


Paragraphs and line breaks 


Unlike other document markup systems, HTML ignores empty lines embedded 
in the document source and runs the text together into a single paragraph on 
the screen. Paragraph breaks must be explicitly marked with the <P> tag. The
</P> end tag can be omitted, but you may find that these are automatically 
inserted by HTML authoring tools. 

The <P> tag usually generates extra vertical space between paragraphs. If you 
want to start a new line without extra vertical space, use the <BR> tag. This 
is an empty element, which means it does not have an end tag. The <BR> tag 
is often used within the ADDRESS element (discussed later in this chapter) to 
separate lines of an address.

The <HR> tag, which is also an empty element, creates a horizontal dividing 
line across the screen. It is often used to separate blocks of information or to 
visually delineate fill-out forms.


Special characters 


HTML uses the character < to start a tag, so you cannot use this character with- 
out a browser interpreting it as markup. Similarly the double quote character 
is used to start and end attribute value strings. 

In order to represent these characters in your HTML documents you must
use the entities &lt; and &quot;. To get a literal & you must use the entity
&amp;. Although HTML uses ISO 8859 for its character set, entities can also
be used for non-ASCII characters such as accented characters in case you
cannot enter such characters directly from the keyboard. Table 4.1 contains
a list of the standard entities.
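
If text is being generated by a program, the substitution of entities for special characters can be automated. The sketch below uses Python's standard html module as one possible way to do this; the sample strings are invented for illustration.

```python
import html

text = 'Use "<P>" & "<BR>" to break lines'

# html.escape() replaces &, < and > and, by default, quote characters too.
escaped = html.escape(text)
print(escaped)

# html.unescape() goes the other way and also understands entities for
# accented characters such as &eacute;.
restored = html.unescape(escaped)
accented = html.unescape("caf&eacute;")
print(restored == text, accented)
```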


List elements 


There are a number of HTML elements for defining different types of list within 
the document body: 


• Unordered lists (UL)


Figure 4.5 An ordered list


Both menu lists and directory lists are variants of unordered lists, and are
intended for lists of short items that can be displayed in a compact style. The
items on a menu list are frequently set up as hypertext links to create the
functionality of a menu. Each menu list item should be a single line and a
directory list item should not be longer than 20 characters. Some browsers
display menu or directory lists in the same way as unordered lists, while others
display them without the bullets that are characteristic of unordered lists.
<H1>Table of Contents</H1>


<MENU>
<LI> <A HREF="#section1">Section 1</A>
<LI> <A HREF="#section2">Section 2</A>
<LI> <A HREF="#section3">Section 3</A>
</MENU>


Definition lists are intended for lists of terms and their definitions. The term 
is preceded by a <DT> tag and the definition by a <DD> tag. It is permissible
to have a number of terms preceding one definition. Definition lists are often 
used for glossaries, for example. Figure 4.6 illustrates the use of description 
lists and Figure 4.7 shows how it is displayed by the X Mosaic browser. 


<TITLE>Parts of a plant</TITLE>
<H2>Parts of a plant</H2>

<DL>
<DT> Bract
<DD> Leaf below the <EM>calyx</EM>.
<DT> Calyx
<DD> Circle of leaf-like material which forms the outer case
of a flower bud.
<DT> Petiole
<DD> The stalk joining a leaf to a stem.
<DT> Spadix
<DD> Closely arranged spike of flowers, usually enclosed by a
<EM>spathe</EM>.
<DT> Spathe
<DD> Large <EM>bract</EM> or pair of bracts enclosing the
<EM>spadix</EM>.
</DL>


Figure 4.6 HTML for a sample description list 


The COMPACT attribute can be specified in the <DL> tag to suggest that the 
browser should display the definition list in a compact form, minimizing the 
amount of space between successive pairs of items. It may also reduce the 
width of the term (DT) column. 


Definition lists can be used to create fancy bullet lists using an icon in each 
DT element, as shown in Figures 4.8 and 4.9. Purists consider this to be a 
misuse of the construct, but currently there is no other way to achieve this 
effect within HTML.


Figure 4.7 A sample description list


Highlighting


HTML has a number of elements for highlighting text, which can be categorized
as logical markup and visual markup elements. Some of the highlighting
elements have not yet been ratified. These elements are indicated with the
word proposed in the lists below.


The following elements are logical markup elements: 


<CITE>...</CITE>       citation
<CODE>...</CODE>       code
<DFN>...</DFN>         defining instance (proposed)
<EM>...</EM>           emphasised text
<KBD>...</KBD>         keyboard
<SAMP>...</SAMP>       literal characters
<STRIKE>...</STRIKE>   strike-out text (proposed)
<STRONG>...</STRONG>   strong emphasis
<VAR>...</VAR>         variable name


The following elements are visual markup elements: 


<B>...</B>     bold
<I>...</I>     italic
<TT>...</TT>   fixed-width
<U>...</U>     underlined (proposed)


Logical markup is generally preferred to visual markup as the interpretation
is less rigid and more configurable.


Block formatting elements


There are three block formatting elements: ADDRESS, BLOCKQUOTE and PRE.


The ADDRESS element is used to format postal addresses, signatures, email
addresses and information of this type. The content is generally displayed in
an italic font, indented or right justified. This element is often added at the
bottom of a document giving the author and date that the document was last
changed, for example:


<ADDRESS>
Andrew Ford (A.Ford@icarus.demon.co.uk), 28 October 1994
</ADDRESS>


The BLOCKQUOTE element is used for including quotations in a document. A
new paragraph is started and text is indented both left and right. The browser
may display this text in a different font.


<TITLE>Example of preformatted text</TITLE>
<H1>Seed Sowing Times in the United Kingdom</H1>
<PRE>
Acanthus mollis
Dianthus neglectus
Helleborus orientalis
Papaver somniferum
</PRE>


Figure 4.10 HTML for preformatted text


Sections of preformatted text can be included in HTML documents by using
the PRE element. These are displayed in a fixed-width font and can be useful
when the formatting of your information is critical but HTML does not provide
the facilities to format your information as you want. An example is tabular
material (until HTML version 3.0 is in widespread use). This is illustrated in
Figures 4.10 and 4.11.


Between the <PRE> and </PRE> tags line boundaries are respected. Anchors
and character formatting are allowed and tab characters expand to one or more
spaces such that the next character appears on a character position that is a
multiple of eight. All text formatted using the PRE element will be displayed by

When using [...] extra [...] put the matching </PRE> at the start of a line [...]
4.6 Hypertext links


A hypertext link is a pointer from a place in a document to another destination,
which may be in the same or a different document. The destination may be
a whole document, a resource such as an external image, a video clip or
a sound file, a labelled point in the original document, or a labelled point in
another document. Hypertext links are what puts the hyper into hypertext.


Hypertext links referring to non-HTML resources usually cause an external
program, or helper application, to be invoked by the browser to display or
play the resource. Setting up browsers, including the use of external
programs, is discussed in Section 9.4.1.


Both the starting point and the destination of a hypertext link are referred to
as anchors and are marked by the anchor tag <A>. Anchors may have a number
of attributes, but they must have one or both of the NAME and the HREF attributes:


<A NAME="name" HREF="dest-url">highlighted text</A>


The HREF attribute specifies that the anchor is the start of a hypertext link and
its value (dest-url) is the destination URL. The browser highlights
the text between the <A> and </A> tags and interprets clicks on the text as a
request for the document referenced. The text should follow on immediately
after the <A> tag and the </A> end tag should immediately follow the text, with
no embedded spaces, otherwise space characters will be highlighted, which
looks silly.


The NAME attribute specifies that the anchor is the destination of a link which
has been set up elsewhere.


The remaining attributes, METHODS, REL, REV and URN, also optional, are not
commonly used or supported by browsers. Their syntax is described in the
HTML specification but their functionality is still being discussed by the Web
development community.


Here are some examples of how different types of hypertext links can be set
up.


<A HREF="http://www.somesite.org/document.html">description</A>


This code could be used to create a hypertext link from one location in a
document to another document. http://www.somesite.org/document.html
is the URL of the destination document, and description will appear as
highlighted text in the original document.


Links referencing a non-document resource such as an image, audio or video
clip are set up in a similar way:


<A HREF="http://www.somesite.org/image.gif">an image</A>
<A HREF="http://www.somesite.org/audio.au">a sound</A>
<A HREF="http://www.somesite.org/video.mov">a movie</A>


The browser makes no interpretation of the URL, merely sending a request to
the named server, which will determine the type of resource, generally from
the filename extension, and send the resource together with an indication of its
type to the browser. The browser will attempt to start an external application
to display any resource that it cannot handle itself.

Hypertext links can be created from one location in a document to another
location in the same document:

<A HREF="#next.topic">jump to the next topic</A>


The destination must have a named anchor:
<A NAME="next.topic">text</A>


You can also make a link to a named anchor in a separate document:


<A HREF="http://www.somesite.org/doc.html#next.topic">jump
to the next topic</A>
When the anchor is selected a browser may use the optional TITLE attribute to
display the title of the document being fetched. Of course, the title specified
in the anchor may be different from that in the document being fetched.


<A HREF="http://www.somesite.org/document.html"
TITLE="An English Country Garden">an English garden</A>
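
To see the two roles an anchor can play, it can help to list the HREF and NAME values found in a piece of markup. The following sketch does this with Python's standard html.parser module. The AnchorCollector class is a hypothetical helper written for this illustration, and the sample markup reuses the placeholder URLs from the examples in this section.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect link sources (HREF) and link destinations (NAME) from <A> tags."""

    def __init__(self):
        super().__init__()
        self.links = []    # values of HREF attributes
        self.targets = []  # values of NAME attributes

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href":
                self.links.append(value)
            elif name == "name":
                self.targets.append(value)


sample = (
    '<A HREF="http://www.somesite.org/document.html" '
    'TITLE="An English Country Garden">an English garden</A>\n'
    '<A HREF="#next.topic">jump to the next topic</A>\n'
    '<A NAME="next.topic">text</A>\n'
)

collector = AnchorCollector()
collector.feed(sample)
print(collector.links)
print(collector.targets)
```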
Chapter 3


A Brief Introduction to
Decision Analysis¹


We'll start simply, with decision trees. I then say a word about utility
theory, then something about multiattribute utility theory. These build upon
decision trees.


3.1 Decision Trees and Their Analysis 


Usually, when we are faced with a decision we are also faced with some
significant related uncertainty. We must decide for whom to vote, but we
are uncertain about who would be the most effective leader. We must decide
how to price a particular product, but we are unsure of how various prices
will affect sales of the product. And so on. It is in fact quite easy to think
up examples of decisions that need to be made in the presence of significant
departures from full knowledge of the consequences. Indeed, such decision
contexts are the rule, not the exception. Let's now consider a simple example,
a simple decision problem that we shall draw lessons from and model.





¹File: di-decision-analysis. Revised: 851222, 951128, 951022, 851023. From:
Figure 3.1: Decision Tree for the Parking Meter Problem


3.1.1 A Simple Example 


Consider the parking meter problem. You have parked your car in a metered
slot and the meter has run out. Your decision is whether or not to plug the
meter. If you plug the meter, it will cost you, say, $1.50, but you are sure of
not getting a ticket. If you do not plug the meter, then you may get a ticket
(for $15.00) or you may not, depending on whether the parking enforcement
person comes by during the next hour.

Do you want the sure thing, for a cost of $1.50, or do you want to take
the chance that you will get a ticket? Suppose you could make a pretty solid
guess at the probability of getting a ticket, and say that probability was 0.4.
You might then reason roughly as follows. If you plug the meter, you will be
out $1.50 for sure. If you do not plug the meter you have a 60% chance of
being out nothing, and a 40% chance of being out $15.32 (the ticket plus the
postage stamp to mail your money in). On average, if you do not plug the
meter, you expect to lose $0.00 × 0.6 + $15.32 × 0.4 = $6.128. This is considerably
more than $1.50, so you ought to plug the meter.
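
The arithmetic in this argument is easy to check mechanically. The short sketch below simply restates the figures from the example; the variable names are chosen here for illustration.

```python
# Figures from the parking meter example in the text.
COST_OF_PLUGGING = 1.50    # dollars paid for sure if you plug the meter
COST_OF_TICKET = 15.32     # $15.00 ticket plus postage
P_TICKET = 0.4             # estimated probability of getting a ticket

# Expected loss of not plugging: each loss weighted by its probability.
expected_loss_no_plug = (1 - P_TICKET) * 0.00 + P_TICKET * COST_OF_TICKET

best = "plug" if COST_OF_PLUGGING < expected_loss_no_plug else "do not plug"
print(round(expected_loss_no_plug, 3), best)
```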

It is helpful, both here and in general, to draw a diagram to represent
the situation. Figure 3.1 is such a diagram, called a decision tree, for the
parking meter problem. The diagram will help us understand how to generalize
the sort of reasoning just expressed. In the diagram, we begin with the
node labelled E. The fact that it is a box indicates that it is a decision node
in the tree. The decision at hand is whether to plug the meter (follow the
line to node C) or not to plug the meter (follow the line to node D). If we
plug the meter, we arrive at a sure outcome, node C (indicated by an oval),
with outcome O_3. In our example, O_3 = -$1.50.

If we do not plug the meter, we arrive at a chance node, node D (indicated
by a circle), where one of two things could happen. First, we might, with
probability p (equal to 0.4 in our example), get a ticket. If so, then we get to
node A, an outcome node as indicated by the oval, with outcome O_1 (equal to
-$15.32 in our example). Second, we might not get a ticket. The probability
of this is 1 - p = 1 - 0.4 = 0.6. If so, then we arrive at outcome node B and
receive O_2 = $0.00 in our example.

In terms of our decision tree, the reasoning we went through above might
be expressed as follows: Node C has a value O_3. If node D had a value,
V_4, then I could make my decision (at node E) simply by choosing the larger
of O_3 and V_4. I can get a reasonable value for node D (since it is a chance
node) by getting the average or expected value of the node. That value is
p · O_1 + (1 - p) · O_2 = 0.4 · (-$15.32) + 0.6 · $0.00 = -$6.128.

Thus, we have illustrated how decision trees are folded back or pruned.
What has worked in this example works generally. The idea is to work
backwards (conventionally, from right to left in the tree diagram) from outcome
nodes to decision or chance nodes, assigning values to the nodes until all
nodes are given a value. Values of the outcome nodes are assumed given.
The value of a chance node is the expected, or weighted average, value of its
daughter nodes (nodes, conventionally, to the right). The value of a decision
node is the maximum of the values of its daughter nodes.
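
The fold-back rule lends itself to a short recursive program. What follows is a minimal sketch under stated assumptions, not a definitive implementation: the class names Outcome, Chance and Decision and the functions fold_back and best_option are hypothetical names introduced here, and outcomes are written as negative dollar amounts, as in the example above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Outcome:
    value: float                                 # given, not computed


@dataclass
class Chance:
    branches: List[Tuple[float, "Node"]]         # (probability, daughter node)


@dataclass
class Decision:
    options: Dict[str, "Node"]                   # label -> daughter node


Node = Union[Outcome, Chance, Decision]


def fold_back(node):
    """Return the value of a node, working back from the outcome nodes."""
    if isinstance(node, Outcome):
        return node.value
    if isinstance(node, Chance):
        # Expected (probability-weighted average) value of the daughters.
        return sum(p * fold_back(child) for p, child in node.branches)
    # Decision node: the best daughter determines the value.
    return max(fold_back(child) for child in node.options.values())


def best_option(decision):
    """Return the label of the daughter with the highest folded-back value."""
    return max(decision.options, key=lambda k: fold_back(decision.options[k]))


# Parking meter tree: node D is the chance node, node E the decision node.
node_d = Chance([(0.4, Outcome(-15.32)), (0.6, Outcome(0.00))])
node_e = Decision({"plug": Outcome(-1.50), "do not plug": node_d})

print(fold_back(node_d), fold_back(node_e), best_option(node_e))
```

Because an outcome node can be swapped for a Chance or Decision node without touching fold_back, the same sketch covers more elaborate trees.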





3.1.2 Comments 


1. The “right” decision to make is the decision with the highest expected
value. We assume that our decision trees begin on the left with a
decision node representing our fundamental choice. Should we plug
the meter or not?


2. The parking meter problem is perhaps the simplest of all nontrivial
decision trees, but it is also a model for many real problems.


3. Decision trees may be elaborated to essentially an arbitrary level of
complexity. Basically, this is done by any or all of the following:


(a) Replace an outcome node with a tree, e.g. with a chance node
followed by two (or more) outcome nodes.


(b) Add one or more daughter nodes to a chance node.
(c) Add one or more daughter nodes to a decision node.


But remember: a decision tree is a model of a real situation. In building
any model, judicious decisions must be made balancing fidelity to the
real phenomena with simplicity and tractability. Here, as elsewhere,
the KISS (= keep it simple, stupid) injunction is well worth remembering.
Also: Think of the outcome nodes as representing a large amount
of unmodeled information. They are stopping points, perhaps only
temporarily, in the analysis. If further reflection reveals that the modeling
needs refinement, then it is a simple matter to replace outcome
nodes with more extensive trees.


4. As models, decision trees should always be subjected to post-evaluation
analysis. Particularly important is sensitivity analysis. Do small changes
in the model's parameters greatly affect the recommended decision?
For the parking meter problem, the parameters are the values of the
three outcomes and the probability of getting a ticket. These are all
exogenous to the problem, but need to be examined.


One way of performing such sensitivity analysis is to ask a series of
what-if questions. In a spreadsheet program this can often be accomplished
simply by changing the value in a cell that holds the value
of the parameter in question, then having the spreadsheet recalculate.
Nothing could be easier. What-if analysis is very useful in practice, but
limited by the fact that it is not systematic. Spreadsheet programs typically
offer a one-way and two-way data table capability. This can be
used to perform post-evaluation analysis in a more systematic fashion.
Further, spreadsheets tend to offer goal-seeking features. These, too,
can be very useful for post-evaluation analysis. For example, in the
parking meter problem we might want to ask how low the probability
of getting a ticket would have to go before we would decide not to plug
the meter.
Figure 3.2: Venn Diagram of Probabilities


5. There are many other interesting and useful manipulations of decision
trees. For the most part we shall resist the temptation to discuss them,
excepting the discussion next, in §3.3. For information on these and
other decision tree manipulations, see the references mentioned in §3.7.
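
The sensitivity question raised in the comments above (how low would the probability of a ticket have to go before we would not plug the meter?) can also be explored outside a spreadsheet. The sketch below is one possible way to do it, using the figures of the parking meter example; the function name recommended and the particular probabilities tabulated are assumptions made for illustration.

```python
COST_OF_PLUGGING = 1.50   # sure cost of plugging the meter
COST_OF_TICKET = 15.32    # ticket plus postage


def recommended(p_ticket):
    """Recommended action for a given probability of getting a ticket."""
    expected_loss = p_ticket * COST_OF_TICKET
    return "plug" if expected_loss > COST_OF_PLUGGING else "do not plug"


# One-way table: vary the probability, hold the other parameters fixed.
for p in (0.05, 0.10, 0.20, 0.40):
    print(f"p = {p:.2f}  expected loss = {p * COST_OF_TICKET:5.2f}  {recommended(p)}")

# Goal seeking: the probability at which the two choices cost the same.
break_even = COST_OF_PLUGGING / COST_OF_TICKET
print(f"break-even probability = {break_even:.4f}")
```

Under these figures the recommendation changes only when the probability of a ticket falls below roughly one in ten, so the decision to plug is not sensitive to moderate errors in the estimate of 0.4.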


3.2 Conditional Probability 


Before we go further, it will be useful to review some basic probability theory, 
especially with regard to conditional probability. 

In general, for any two events (or sets of events) α and β, P(α|β) is the
probability that α occurs given that β occurs, and it is defined as follows.

\[
P(\alpha \mid \beta) \stackrel{\mathrm{def}}{=} \frac{P(\alpha \cap \beta)}{P(\beta)} \tag{3.1}
\]

The expression (α ∩ β) means "α and β" and is said to represent the intersection
of events (or sets) α and β. Similarly, the expression (α ∪ β) means
"α or β" and is said to represent the union of events (or sets) α and β. So,
P(α ∩ β) represents the probability that both events, α and β, occur. Similarly,
P(α ∪ β) represents the probability that either event α occurs or event
β occurs, or they both occur.

Now, it is important for you to convince yourself that the definition of conditional
probability, equation 3.1, is in fact a good one. Let us try drawing
some Venn diagrams and thinking about the definition.

In figure 3.2 we have a representation of all possible events, U, by the
area within the rectangle. We assume that P(U) = 1, that is, the probability
of any event being within U is 1. Equivalently, P(Ū) = P(∅) =
0, that is, the probability of any event being outside U is zero. For example,
if we were tossing a die, then the set of possible outcomes, O, is
{1}, {2}, {3}, {4}, {5}, {6}. U, the set of possible events, encompasses
the elements of O, including the null set, ∅, as well as unions and intersections
of them. Thus, if α and β are events, then so are their union, (α ∪ β),
and their intersection, (α ∩ β). Finally, if α is an event, then so is ᾱ (= not α).

Inside U, the space of possible events, we have two ovals. The left oval
represents the A event, the right oval the B event, and C the event consisting
of the intersection of A and B, i.e., C = (A ∩ B). For example, if we were
tossing a die, A might stand for the event of getting an even number and B for
the event of getting a number less than or equal to 3. C, the intersection, is
the event of getting a 2. What is the probability of A? It is the probability of
getting a 2, a 4, or a 6, i.e., P({2} ∪ {4} ∪ {6}). The probability of B
is P(B) = P({1} ∪ {2} ∪ {3}). What is the probability of A, given that B has
occurred, i.e., P(A|B)? Reflection on the diagram, figure 3.2, would suggest
that A occurs when B occurs only if C occurs and the probability of C when
B occurs is just the ratio of P(C) to P(B). But, since P(C) = P(A ∩ B), this
is exactly what the formula (equation 3.1), i.e., the definition of conditional
probability, tells us. Indeed, the definition is a good one.
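
The die example can be worked through numerically. The sketch below assumes a fair die, so that every outcome is equally likely (the text does not state this explicitly), and uses exact fractions; the helper name prob is chosen here for illustration.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}                 # one toss of a die, assumed fair
A = {n for n in outcomes if n % 2 == 0}       # an even number
B = {n for n in outcomes if n <= 3}           # a number less than or equal to 3
C = A & B                                     # the intersection: getting a 2


def prob(event):
    """Probability of an event when every outcome is equally likely."""
    return Fraction(len(event), len(outcomes))


p_a_given_b = prob(A & B) / prob(B)           # the definition of P(A|B)
print(prob(A), prob(B), prob(C), p_a_given_b)
```

Of the three outcomes in B, only the 2 also lies in A, which is why the ratio comes out as one third.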


3.3 More Information 


Suppose you are standing near your parking meter, having reflected on
whether to plug the meter. Just as you are about to insert the coins, an
enterprising street person approaches you with an offer to sell you information
as to the whereabouts of the parking meter enforcement personnel. Your decision
problem is now complicated and you need to deliberate further. There are
three cases for you to consider:


1. The street person will report with completely accurate information.
That is, if the street person reports that there will be no visit from
a representative of the parking authorities, then there will be no such
visit (and you will not get a ticket) during the next hour; and similarly
otherwise, the street person is entirely reliable. In this case, you need to
determine EVPI, the expected value of perfect information (since you
are offered perfect information), and use the results of this calculation
to assess the price being asked by the street person.

2. The street person will report with probabilistically accurate information
and you know (we assume with certainty) the reliability of the
street person's reports, in the sense that you know the probability of a
visit by the parking authorities, given that the street person says there
will be a visit, and so on. In this case, you need to determine the EVSI
(expected value of sample information) with outcome-given-report
information. You will use the results of these calculations to assess the
price being asked by the street person.





3. Finally, the street person may report with probabilistically accurate
information and you also know (we assume with certainty) the reliability
of the street person's reports, in the sense that you know the
probabilities of making the various reports, given that the various outcomes
will occur. For example, you know the probability of the street
person's saying "There will be a visit by the parking authorities during the
next hour," given that there will not be such a visit. Note that this is
distinctly different from the case (above) of EVSI with outcome-given-report
information. There you have, e.g., the probability of there not
being a visit by the authorities given that the report says "There will
be a visit by the parking authorities during the next hour." We call this
third case the case of EVSI with report-given-outcome information.


We now consider each of the three cases individually. In each case, we
will alter our original decision tree, Figure 3.1, calculate or determine some
new probabilities, and fold back the tree.


3.3.1 EVPI: Expected Value of Perfect Information 


Suppose the street person's report will be entirely accurate, and we could
obtain the report without cost. This results in a revised decision tree, shown
in Figure 3.3.

Notice that the essential change is that the event providing information
about whether there will be a visit by the parking authorities (node A in
Figure 3.3) precedes rather than follows (as in Figure 3.1) any decision on
whether to plug the meter. This is equivalent to having nature move "first,"
and nature's choice is revealed to you before you have to decide whether to
plug the meter.


Figure 3.3: The Parking Meter Problem with Perfect Information. Note: V′
(Plug′, etc.) and V̄ (Plug with an overbar, etc.) are alternate notations for
the complement of the set V (Plug, etc.).


The expected value of the decision with perfect information (i.e., the tree
or tree fragment in Figure 3.3) is
0.4 · −$1.50 + 0.6 · $0.00 = −$0.60. Why? Begin at the left of the tree in Figure
3.3. The report will either be favorable (R(V′) = "No visit") or unfavorable
(R(V) = "Beware, a visitor will come!"). Since the probability of a visit is,
as before, 0.4, we must assume that this perfectly reliable reporter will
report the unfavorable outcome with probability equal, as before, to 0.4.
That is, P(V) = P(R(V)). A similar argument applies to
any other possible reports, although here there is just one: P(V′) = P(R(V′)).
So, with probability of 0.4, you will hear "Beware, a visitor will come!" and
you plug the meter for $1.50, since O-1 = −$1.50 and O-2 = −$15.32. With
probability of 0.6, you will hear "No visit," and you will not plug the meter,
since O-1 = −$1.50 and O-3 = $0.00.

On average, you lose $0.60: the expected value of your decision with
perfect information is −$0.60. The expected value of your decision
without perfect information (or any new information at all) is, as we have
seen, −$1.50. The EVPI for your decision is just the difference:

\[ \$0.90 = (-\$0.60) - (-\$1.50) \tag{3.2} \]

Here, and in general, the expected value of perfect information for a decision
is equal to the expected value of the decision with perfect information, minus
the expected value of the decision without additional information. Think of
it this way:

\[ \text{EVPI} = \text{EVofPI} = \text{EVDwithPI} - \text{EVDwithPriors} \tag{3.3} \]

So, what do you do? Assuming you are still working under the assumption
of deciding based on expected monetary value, you should only buy the
information if it costs less than $0.90.
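
The EVPI calculation can be sketched in a few lines of code. This is a minimal illustration, not the author's procedure: the probabilities and payoffs are the ones given in the text, while the function and variable names are our own.

```python
# Parking meter problem, using the values stated in the text.
P_VISIT = 0.4
O1 = -1.50    # plug the meter (same payoff whether or not a visit occurs)
O2 = -15.32   # do not plug, visit occurs (ticket)
O3 = 0.00     # do not plug, no visit

ACTIONS = ("plug", "no_plug")

def payoff(action, visit):
    """Dollar outcome of an action, given whether a visit occurs."""
    if action == "plug":
        return O1
    return O2 if visit else O3

def ev_with_priors(p_visit):
    """Best expected value when you must decide before nature moves."""
    return max(
        p_visit * payoff(a, True) + (1 - p_visit) * payoff(a, False)
        for a in ACTIONS
    )

def ev_with_perfect_info(p_visit):
    """Expected value when the outcome is revealed before you decide."""
    best_if_visit = max(payoff(a, True) for a in ACTIONS)
    best_if_no_visit = max(payoff(a, False) for a in ACTIONS)
    return p_visit * best_if_visit + (1 - p_visit) * best_if_no_visit

evd_priors = ev_with_priors(P_VISIT)
evd_pi = ev_with_perfect_info(P_VISIT)
evpi = evd_pi - evd_priors

print(f"EVDwithPriors = {evd_priors:.2f}")
print(f"EVDwithPI     = {evd_pi:.2f}")
print(f"EVPI          = {evpi:.2f}")
```

The only difference between the two functions is where the max is taken: outside the expectation (decide first) or inside it (nature moves first).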


3.3.2 EVSI with Outcome-Given-Report Information 


Of course, it is unrealistic to think of the street person approaching you with
a business proposition as being perfectly reliable. What this person offers
may not have the value of perfect information, but it may still have value
and you may rationally want to listen to it.

Figure 3.4 shows the basic scenario for the parking meter problem in the
presence of imperfect information. As before, O-1 = −$1.50, O-2 = −$15.32,
and O-3 = $0.00.


Figure 3.4: The Parking Meter Problem with Imperfect Information


As in the case of EVPI, we begin on the left with a chance node whose
daughter nodes, R(V) ("Beware, a visitor will come") and R(V′) ("No
visit"), represent the possible outcomes of the report you are considering
buying. Represent the probabilities of these events as P(R(V)) and P(R(V′)),
where P(R(V′)) = 1 − P(R(V)). If the report is unfavorable, R(V), the
outcome may or may not be favorable (V, a visit, or not, V′). These probabilities
are now conditioned upon the outcome of the report. That is, e.g., given
that the report is unfavorable, there is a probability that the actual outcome
is favorable. We symbolize this probability as P(V′|R(V)).

The tree in Figure 3.4 has all of its chance node daughter branches labeled
symbolically, with P(R(V)), P(V|R(V)), and so on. If we are to use this
decision tree to make decisions, we will need to determine the actual values
for these probabilities. If all the actual values are given, then we can fold back
the tree to get the expected value of the decision with sample information
(EVDwithSI), and we can then calculate EVSI much as we did EVPI:

\[ \text{EVSI} = \text{EVofSI} = \text{EVDwithSI} - \text{EVDwithPriors} \tag{3.4} \]


But of course, typically not all the actual values for the required
probabilities are given directly. There are then two interesting, and common,
cases:

1. The probabilities of the possible outcomes given the possible reports
are known, and the other probabilities are to be calculated. This is the
outcome-given-report case and we consider it in the present section.

2. The probabilities of the possible reports given the possible outcomes
are known, and the other probabilities are to be calculated. This is
the report-given-outcome case and we consider it in the next section,
§3.3.3.





Assume, for our present example, that we know the following probabilities:

1. P(V) = 0.4. We have this from the original problem statement.

2. P(V′) = 1 − P(V) = 0.6. Again, we have this as part of the basic
problem.

3. P(V|R(V)) = 0.7. The probability of an actual visit by a parking
authority person, given that the report from the street person says
there will be such a visit, is 0.7. How do we know this? A student
association at the local university has performed a careful scientific
study on the past performance of the relevant predictions by street
people and has arrived at this number.

Similarly, we have the following probabilities.

4. P(V′|R(V)) = 1 − P(V|R(V)) = 0.3. Read this as: The probability of
not having a visit, given that the report says there will be a visit, is
0.3.

5. P(V|R(V′)) = 0.05.

6. P(V′|R(V′)) = 1 − P(V|R(V′)) = 0.95.

With all this given, we still have two required probabilities undetermined:
P(R(V)) and P(R(V′)). And using the fact that P(R(V′)) = 1 − P(R(V)),
we are left with one probability to compute. How do we do it?


It is useful here to invoke a very generally useful probabilistic identity.
Let α and β be any two sets representing events (or sets of events). Recall
that ᾱ ("alpha bar" = α′ or "alpha prime") is the set of events that obtains
if any event outside α obtains (and similarly for other events, e.g., β and
β′). Then, if α and β are disjoint, i.e., α ∩ β = ∅, then by the axioms of
probability theory:

\[ P(\alpha \cup \beta) = P(\alpha) + P(\beta) \tag{3.5} \]

(Note: This identity is intuitively correct and you should convince yourself
it is in fact correct.) Next, we have a set-theoretic identity. For any sets α
and β,

\[ \alpha = (\alpha \cap \beta) \cup (\alpha \cap \beta') \tag{3.6} \]

(Note: This identity is also intuitively correct and you should convince
yourself it is in fact correct.) Further, since (α ∩ β) ∩ (α ∩ β′) = ∅,

\[ P(\alpha) = P(\alpha \cap \beta) + P(\alpha \cap \beta') \tag{3.7} \]


Recall the definition of conditional probability, Equation 3.1 above, repeated
here as Equation 3.8.

\[ P(\alpha \mid \beta) \overset{\text{def}}{=} \frac{P(\alpha \cap \beta)}{P(\beta)} \tag{3.8} \]

It follows that

\[ P(\alpha \mid \beta) \cdot P(\beta) = P(\alpha \cap \beta) \tag{3.9} \]

Substituting into Equation 3.7 we get an important and general
identity,

\[ P(\alpha) = P(\alpha \mid \beta) \cdot P(\beta) + P(\alpha \mid \beta') \cdot P(\beta') \tag{3.10} \]

Substituting our present values into Equation 3.10, we get

\[ P(V) = P(V \mid R(V)) \cdot P(R(V)) + P(V \mid R(V')) \cdot P(R(V')) \tag{3.11} \]
Solving for P(R(V)), and using P(R(V′)) = 1 − P(R(V)), we get

\[ P(R(V)) = \frac{P(V) - P(V \mid R(V'))}{P(V \mid R(V)) - P(V \mid R(V'))} \tag{3.12} \]

or

\[ P(R(V)) = \frac{0.4 - 0.05}{0.7 - 0.05} = 0.54 \tag{3.13} \]

When we fold back the tree we find that if R(V), i.e., if the report is that a
visit is coming, then we plug the meter and the expected value of this fork
is 0.54 · −$1.50 = −$0.81. On the other hand, if the report is that a visit is
not coming, then we do not plug the meter and the expected value of this
fork is 0.46 · (0.05 · −$15.32) = −$0.35. Taken together, the EVDwithSI is
−$0.81 − $0.35 = −$1.16. Since EVSI = EVDwithSI − EVDwithPriors, we have

\[ \text{EVSI} = -\$1.16 - (-\$1.50) = \$0.34 \tag{3.14} \]
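
The outcome-given-report calculation can be sketched as follows. The probabilities and payoffs are those stated in this section; the function best_value and the variable names are our own illustrative choices, and the results differ from the text's in the third decimal place because the text rounds intermediate values.

```python
# Values stated in the text.
P_V = 0.4                 # P(V): prior probability of a visit
P_V_GIVEN_RV = 0.7        # P(V | R(V)): visit, given a "visit" report
P_V_GIVEN_RNV = 0.05      # P(V | R(V')): visit, given a "no visit" report
PLUG, TICKET, NOTHING = -1.50, -15.32, 0.00

# Equation 3.12: solve the total-probability identity for P(R(V)).
p_rv = (P_V - P_V_GIVEN_RNV) / (P_V_GIVEN_RV - P_V_GIVEN_RNV)

def best_value(p_visit):
    """Fold back one decision node: plug for sure, or gamble on no visit."""
    return max(PLUG, p_visit * TICKET + (1 - p_visit) * NOTHING)

# Fold back the tree: weight the best choice under each report.
evd_with_si = (p_rv * best_value(P_V_GIVEN_RV)
               + (1 - p_rv) * best_value(P_V_GIVEN_RNV))
evd_with_priors = best_value(P_V)
evsi = evd_with_si - evd_with_priors

print(f"P(R(V))       = {p_rv:.2f}")
print(f"EVDwithSI     = {evd_with_si:.2f}")
print(f"EVDwithPriors = {evd_with_priors:.2f}")
print(f"EVSI          = {evsi:.2f}")
```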


3.3.3 EVSI with Report-Given-Outcome Information 


Suppose instead of the information we had in the previous case, §3.3.2, we
have the following.

1. P(V) = 0.4. We have this from the original problem statement.

2. P(V′) = 1 − P(V) = 0.6. Again, we have this as part of the basic
problem.

3. P(R(V)|V) = 0.7. The probability that the report from the street
person says there will be a visit by a parking authority person, given that
such a visit actually occurs, is 0.7. How do we know this? A student
association at the local university has performed a careful scientific
study on the past performance of the relevant predictions by street
people and has arrived at this number.

Similarly, we have the following probabilities.

4. P(R(V′)|V) = 1 − P(R(V)|V) = 0.3. Read this as: The probability
that the report says there will not be a visit, given that there is a visit,
is 0.3.

5. P(R(V)|V′) = 0.05.

6. P(R(V′)|V′) = 1 − P(R(V)|V′) = 0.95.


Here, we need to calculate more quantities than in §3.3.2. In particular,
we need to find

1. P(R(V)). (Note: P(R(V′)) = 1 − P(R(V)).)
2. P(V|R(V)). (Note: P(V′|R(V)) = 1 − P(V|R(V)).)
3. P(V|R(V′)). (Note: P(V′|R(V′)) = 1 − P(V|R(V′)).)

P(R(V)) is easy. Recall an identity from §3.3.2, Equation 3.10, reprinted
below as Equation 3.15.

\[ P(\alpha) = P(\alpha \mid \beta) \cdot P(\beta) + P(\alpha \mid \beta') \cdot P(\beta') \tag{3.15} \]

Substituting in our known or assumed values, we have

\[ P(R(V)) = P(R(V) \mid V) \cdot P(V) + P(R(V) \mid V') \cdot P(V') \tag{3.16} \]

or

\[ P(R(V)) = 0.7 \cdot 0.4 + 0.05 \cdot 0.6 = 0.31 \tag{3.17} \]

Notice that

\[ P(R(V')) = P(R(V') \mid V) \cdot P(V) + P(R(V') \mid V') \cdot P(V') \tag{3.18} \]

or

\[ P(R(V')) = 0.3 \cdot 0.4 + 0.95 \cdot 0.6 = 0.69 \tag{3.19} \]
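Now we consider how to calculate P(V|R(V)). To make this calculation
it will be helpful, even essential, to have available to us another probabilistic
identity, known as Bayes's rule. To derive a form of this rule, we begin with
the definition of conditional probability, given above as Equation 3.1 and
reproduced here as Equation 3.20.

\[ P(\alpha \mid \beta) \overset{\text{def}}{=} \frac{P(\alpha \cap \beta)}{P(\beta)} \tag{3.20} \]

Since (α ∩ β) = (β ∩ α) we rearrange the right-hand side of Equation 3.20 to
get

\[ P(\alpha \mid \beta) = \frac{P(\beta \cap \alpha)}{P(\beta)} \tag{3.21} \]

But since, by the definition of conditional probability,

\[ P(\beta \cap \alpha) = P(\beta \mid \alpha) \cdot P(\alpha) \tag{3.22} \]

we substitute into Equation 3.21 and get:

\[ P(\alpha \mid \beta) = \frac{P(\beta \mid \alpha) \cdot P(\alpha)}{P(\beta)} \tag{3.23} \]

Equation 3.23 is one version of Bayes's rule. And it is just what we need.
Substituting our quantities (or their symbols) into Equation 3.23 we have

\[ P(V \mid R(V)) = \frac{P(R(V) \mid V) \cdot P(V)}{P(R(V))} \tag{3.24} \]

or

\[ P(V \mid R(V)) = \frac{0.7 \cdot 0.4}{0.31} = 0.90 \tag{3.25} \]

Similarly,

\[ P(V' \mid R(V)) = \frac{P(R(V) \mid V') \cdot P(V')}{P(R(V))} \tag{3.26} \]

or

\[ P(V' \mid R(V)) = \frac{0.05 \cdot 0.6}{0.31} = 0.10 \tag{3.27} \]

Also,

\[ P(V' \mid R(V')) = \frac{P(R(V') \mid V') \cdot P(V')}{P(R(V'))} \tag{3.28} \]

or

\[ P(V' \mid R(V')) = \frac{0.95 \cdot 0.6}{0.69} = 0.83 \tag{3.29} \]

and that

\[ P(V \mid R(V')) = \frac{P(R(V') \mid V) \cdot P(V)}{P(R(V'))} \tag{3.30} \]

or

\[ P(V \mid R(V')) = \frac{0.3 \cdot 0.4}{0.69} = 0.17 \tag{3.31} \]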


When we fold back the tree we find that if R(V), i.e., if the report is
that a visit is coming, then we plug the meter and the expected value of this
fork is 0.31 · −$1.50 = −$0.47. On the other hand, if the report is that a
visit is not coming, then we again plug the meter and the expected value
of this fork is 0.69 · −$1.50 = −$1.04. Taken together, the EVDwithSI is
−$0.47 − $1.04, which, taking into account rounding errors, is −$1.50. Since
EVSI = EVDwithSI − EVDwithPriors, we have

\[ \text{EVSI} = -\$1.50 - (-\$1.50) = \$0.00 \tag{3.32} \]

This information is not worth buying.
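
The report-given-outcome calculation combines the total-probability identity with Bayes's rule, and can be sketched as below. The inputs are the values stated in this section; the names are our own, and this is an illustration rather than a definitive implementation.

```python
# Values stated in the text.
P_V = 0.4                 # P(V)
P_RV_GIVEN_V = 0.7        # P(R(V) | V)
P_RV_GIVEN_NV = 0.05      # P(R(V) | V')
PLUG, TICKET, NOTHING = -1.50, -15.32, 0.00

# Total probability: P(R(V)) and P(R(V')).
p_rv = P_RV_GIVEN_V * P_V + P_RV_GIVEN_NV * (1 - P_V)
p_rnv = 1 - p_rv

# Bayes's rule: P(V | R(V)) and P(V | R(V')).
p_v_given_rv = P_RV_GIVEN_V * P_V / p_rv
p_v_given_rnv = (1 - P_RV_GIVEN_V) * P_V / p_rnv

def best_value(p_visit):
    """Fold back one decision node: plug for sure, or gamble on no visit."""
    return max(PLUG, p_visit * TICKET + (1 - p_visit) * NOTHING)

evd_with_si = p_rv * best_value(p_v_given_rv) + p_rnv * best_value(p_v_given_rnv)
evsi = evd_with_si - best_value(P_V)

print(f"P(R(V)) = {p_rv:.2f}, P(R(V')) = {p_rnv:.2f}")
print(f"P(V|R(V)) = {p_v_given_rv:.2f}, P(V|R(V')) = {p_v_given_rnv:.2f}")
print(f"EVSI = {evsi:.2f}")
```

Because plugging the meter is the best action under either report, the report never changes the decision, which is why its value is zero.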


3.3.4 A Tabular Approach for Doing Certain Calculations²


The calculations we just illustrated using Bayes's rule are correct, but many
find the algebra forbidding. So, here we suggest an alternative approach, or
really representation since the underlying approach is the same, to problems
calling for application of Bayes's rule.

Beginning abstractly, suppose we are given

1. P(α|β)
2. P(α′|β)
3. P(α|β′)
4. P(α′|β′)
5. P(β)
6. P(β′)

and we wish to find

1. P(α)
2. P(α′)
3. P(β|α)
4. P(β′|α)
5. P(β|α′)
6. P(β′|α′)

We can express the information given in tabular form, as in table 3.1.

²Thanks to James D. Laing for suggesting the approach described in this section.





Table 3.1: Tabular Approach to Bayes's Rule: Conditional Probabilities of
Rows, Given Columns


Now, using just the information given in table 3.1, we can form a second
table, table 3.2.


[Table 3.2: entries of the form P(α|β) · P(β), i.e., each conditional
probability from table 3.1 multiplied by the probability of its column.]


Table 3.2: Tabular Approach to Bayes's Rule: The Given Information, Plus
One Step


But if we just simplify the entries in table 3.2, using the definition of
conditional probability, we get the joint probability distribution shown in
table 3.3.


Table 3.3: Table 3.2 Simplified: Joint Probabilities of Rows and Columns


Notice that in the right-hand column of table 3.3 we have two of the six
items we are in search of, P(α) and P(α′). In order to get the other four
items we seek, we use the information in table 3.3 to get table 3.4.





Table 3.4: (Almost) Final Table


But table 3.4 just simplifies to table 3.5, which contains the last four
items we seek.


Table 3.5: Table 3.4 Simplified: Conditional Probabilities of Columns, Given
Rows
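
The sequence of tables can be sketched in code. This is our own reading of the procedure, under the assumption that rows are the events whose marginal probabilities we want and columns are the events whose probabilities are given; the function name and the use of the parking meter numbers from §3.3.3 are illustrative choices.

```python
def tabular_bayes(rows_given_cols, col_probs):
    """Tabular route from P(row | column) and P(column) to P(column | row).

    rows_given_cols[i][j] is P(row i | column j); col_probs[j] is P(column j).
    Assumes every row has nonzero probability (otherwise we would divide by 0).
    """
    # Multiply each entry by its column's probability: joint probabilities.
    joint = [[cond * p_col for cond, p_col in zip(row, col_probs)]
             for row in rows_given_cols]
    # Right-hand column: each row sum is that row's marginal probability.
    row_probs = [sum(row) for row in joint]
    # Divide each joint entry by its row sum: P(column | row).
    cols_given_rows = [[cell / p_row for cell in row]
                       for row, p_row in zip(joint, row_probs)]
    return joint, row_probs, cols_given_rows

# Rows: reports R(V), R(V'). Columns: outcomes V, V'.
rows_given_cols = [[0.7, 0.05],
                   [0.3, 0.95]]
col_probs = [0.4, 0.6]

joint, row_probs, cols_given_rows = tabular_bayes(rows_given_cols, col_probs)
for label, p_row, row in zip(("R(V)", "R(V')"), row_probs, cols_given_rows):
    print(label, round(p_row, 2), [round(x, 2) for x in row])
```

With these inputs the row sums and conditional probabilities agree, up to rounding, with the values obtained algebraically in §3.3.3.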


3.3.5 A Note on Generalization 


The concepts described in this section apply to more than the parking meter
problem as presented. In fact, they apply to any decision tree, no matter
how complex or whether or not the decision criterion is expected monetary
value. Many of the formulas we used, however, might give you a different
impression. Because we worked with only two branches from any node, all of
the "general" formulas were expressed in terms of α, β, and β′. In the truly
general case, these formulas may be expressed in terms of α, β₁, β₂, ..., βₙ. The
key to generalization is this. The formulas so far rely upon the fact that
β ∩ β′ = ∅ (β and β′ are mutually exclusive or disjoint) and β ∪ β′ = Ω (β and
β′ are mutually exhaustive; together they cover all possibilities, represented
by the universal set, Ω). So long as:

1. βᵢ ∩ βⱼ = ∅ for all i, j where i ≠ j
2. β₁ ∪ β₂ ∪ ... ∪ βₙ = Ω

the formulas above, and the tabular approach described in §3.3.4, generalize
in the obvious ways to β₁, β₂, ..., βₙ.
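
A minimal sketch of the n-way versions of the total-probability identity and Bayes's rule follows. The function names are ours, and the three-way example at the end uses made-up numbers purely for illustration; they do not come from the text.

```python
def total_probability(likelihoods, priors):
    """P(a) = sum over i of P(a | b_i) * P(b_i), for a partition b_1, ..., b_n.

    likelihoods[i] is P(a | b_i); priors[i] is P(b_i).
    """
    if len(likelihoods) != len(priors):
        raise ValueError("need one likelihood per partition member")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 (the partition must be exhaustive)")
    return sum(l * p for l, p in zip(likelihoods, priors))

def bayes_posteriors(likelihoods, priors):
    """P(b_i | a) for every member of the partition, by Bayes's rule."""
    p_a = total_probability(likelihoods, priors)
    return [l * p / p_a for l, p in zip(likelihoods, priors)]

# Two-way case from the parking meter problem: a = R(V), partition = {V, V'}.
p_report = total_probability([0.7, 0.05], [0.4, 0.6])

# Hypothetical three-way partition (illustrative numbers only).
posteriors = bayes_posteriors([0.1, 0.4, 0.8], [0.5, 0.3, 0.2])
print(round(p_report, 2), [round(x, 3) for x in posteriors])
```

The two conditions in the list above correspond to the two checks a careful implementation needs: the priors describe non-overlapping events, and they sum to one.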





3.4 Utility Theory 
Two questions: 


1. Decision trees seem nice, but what if the outcomes aren't measured
in dollars? What if outcomes are in lives saved, or hassle avoided or
whatever?

2. Why should we decide based on expected value of dollars? Aren't we
just taking expected values because we don't know what else to do with
chance nodes?

These are good questions and it turns out that their answers are closely
related.

Taking the second question first, it is indeed true that there is not any
obvious real alternative to reducing chance nodes to their expected values.
If we are going to reduce a chance node (or situation) to a single number,
then expected value is the natural choice, if only for lack of an attractive
alternative. Nor is there any attractive alternative to reducing chance nodes
to single numbers, given that we want a numerical representation of the value
of our best decision.

But if we are to take expected values, how should we measure the values
of what it is we are taking the expected values of? (We're still on the second
question, above.) Are dollars the right measure? Here's an example, due to
Nicolas Bernoulli and called Bernoulli's paradox (also known as the St.
Petersburg paradox), that shows that dollars are
not always the right measure. Suppose you are offered the opportunity to
participate in a game. The game works as follows. We begin with a fair coin
(there are no tricks here and everything is as certifiably legit as you want).
We flip the coin. If it comes up heads, you win $2 and the game is over. If
it comes up tails, you win nothing, but we continue to flip. If on the second
toss the coin comes up heads, you win $4 and the game is over, and if not
we flip the coin a third time. All in all, we flip the coin until a head comes
up and then we stop. You win $2^n, where n is the number of flips it takes to
get the first head. Nice game, but I'm not giving this away. You have to buy
a ticket in order to play. Ask yourself what the most is that you would be
willing to pay for such a ticket. Remember: you can have all the assurance
you want that everything is done on the up and up.





Suppose (with no loss of generality) that you are willing to pay $30 to buy
a ticket for this game. Put another way, if offered a choice between
$30 or getting to play the game, you would be indifferent. If you had to pay
$31 to play the game, you would rather keep your money, but you would
gladly pay only $29 to play. So, you are indifferent between having $30 and
being able to play this game. Their values are about equal so far as you are
concerned. Fine. Now, we know that you value the game at $30, but what
is the expected value of the game? It's infinite!

\[ EV = \sum_{n=1}^{\infty} \left(\tfrac{1}{2}\right)^{n} \cdot \$2^{n} = \$1 + \$1 + \$1 + \cdots = \infty \tag{3.33} \]


So, your preferences are not consistent with expected monetary value.
Moreover, since you cannot actually value any game infinitely, it is not possible
to have any preferences at all that are consistent with expected monetary
value, in this case. Hence, at least here, expected dollars cannot represent
your preferences. Nor anyone else's really.

What are we to make of this? Bernoulli observed,
and many after him, that with increased wealth, added dollars (or whatever your
favorite currency is) are individually valued less (by most people) than dollars
gotten in relative poverty. In short, $1,000,000 is worth a lot, but it is not
worth 1,000 times what $1,000 is worth. What, then, is it worth? Modern utility
theory, the most directly relevant theory to the issues at hand, does not tell
us. Rather, it tells us to expect that different people will legitimately differ
on this question, and it tells us how to find out how someone values, say,
$1,000,000.
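
To see why the expected value has no finite limit, it helps to cap the game at a maximum number of flips. The sketch below is an illustration under that added assumption (the capped game pays nothing if no head appears within the cap); the function name is ours.

```python
from fractions import Fraction

def truncated_expected_value(max_flips):
    """Expected winnings, in dollars, if the game is cut off after max_flips flips.

    The first head arrives on flip n with probability (1/2)**n and pays 2**n
    dollars, so every term of the sum contributes exactly one dollar.
    """
    return sum(Fraction(1, 2) ** n * 2 ** n for n in range(1, max_flips + 1))

for cap in (1, 10, 100, 1000):
    print(cap, truncated_expected_value(cap))
```

Each additional permitted flip adds one more dollar of expected value, so the uncapped sum grows without bound even though large payoffs are extremely unlikely.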

But this is, as it were, getting ahead of the game. Getting back to our two
questions, if we want to take expected values of gambles (chance nodes) for
purposes of decision making and if dollars are not always the right way to
measure outcomes, then what do we use? We
assume a sort of theoretical, abstract, certainly hypothetical, currency,
utility. Grounding all this are people's individual preferences. We should
assume that preferences are more or less definite and concrete: people prefer
this to that and are willing to act convincingly to demonstrate it. Utility
is a numerical representation, or measure, of preference. The core idea of
utility theory is that if a decision maker has preferences pertaining to some
particular outcomes and those preferences meet certain specific conditions
of rationality (more on this shortly), then there is a utility function on the
outcomes such that the decision maker prefers one lottery (gamble, chance
node, etc.) over another if and only if the expected utility value of the one
lottery is greater than that of the other. Put more simply, utility theory says
that if you abide by the axioms (conditions of rationality) of utility theory,
then there is always a way to measure the value of outcomes so that in the
presence of uncertainty, taking the expected (utility) value of the decision
trees is the right thing to do. If this is right, then indeed we have answered
both of the questions that began this section.

For the interested, here are the four basic assumptions of utility theory.
I present them following the excellent and accessible treatment by Kleindorfer,
Kunreuther, and Schoemaker [9, Appendix A]. A notational convenience:
suppose we have a lottery (gamble, chance node) in which outcome (or
alternative) A occurs with probability p and outcome (or alternative) B occurs
with probability (1 − p). We shall represent this lottery as: [p : A, B]. Now
to the four axioms of utility theory.











1. Transitivity: For any three outcomes, A, B, C, if a decision maker
prefers A to B, and B to C, then the decision maker prefers A to C.
Formally expressed: If A ≽ B and B ≽ C, then A ≽ C, where φ ≽ ψ
reads "φ is weakly preferred to ψ."

2. Continuity: For any three outcomes, A, B, C, such that A ≽ B ≽ C,
there is a probability, p, such that the decision maker is indifferent
between B for sure and the lottery [p : A, C]. Formally expressed:
B ~ [p : A, C], where "φ ~ ψ" reads "φ is judged to be equally as good
as ψ."

3. Independence: For any four outcomes, A, B, C, and D, such that
A ≽ B and C ≽ D, then for any p, [p : A, C] ≽ [p : B, D].

4. Reduction: For any outcomes, A and B, and for all probabilities,
p, p₁, and p₂, [p : [p₁ : A, B], [p₂ : A, B]] ~ [r : A, B], where r =
p · p₁ + (1 − p) · p₂.





Nothing requires anyone to obey these axioms, but there is a certain
attractiveness to them. They seem, at least to many people, to be acceptable
as conditions of rationality. After all, can you really be rational if your
preferences are not transitive?
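
Of the four axioms, reduction is the most mechanical, and a short sketch may make it concrete. The idea is that a two-stage lottery over A and B is treated as equivalent to the one-stage lottery that gives A the same overall probability, which by the total-probability identity is p · p₁ + (1 − p) · p₂. The function name and the example numbers below are our own.

```python
def reduce_compound(p, p1, p2):
    """Overall probability r of outcome A in the compound lottery
    [p : [p1 : A, B], [p2 : A, B]], so that it reduces to [r : A, B].
    """
    for q in (p, p1, p2):
        if not 0.0 <= q <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    # With probability p we face the first sub-lottery, otherwise the second.
    return p * p1 + (1 - p) * p2

# Illustrative numbers: an even chance of facing [0.8 : A, B] or [0.2 : A, B].
r = reduce_compound(0.5, 0.8, 0.2)
print(round(r, 6))
```

A decision maker who obeys the axiom cares only about r, not about how many stages the lottery was dressed up in.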








There is very broad agreement that choice, if it is rational, must obey 
these principles so long as the outcomes are not affected by another agent. 
That is, when nature determines which outcomes occur, then expected utility 
is, it is generally agreed, the right set of principles for rational choice. If,
however, other agents may have a hand in determining which outcomes occur,
then we have moved into the realm of game theory and the broad consensus
on expected utility no longer obtains. In fact, the problem of characterizing
rationality in game-theoretic contexts remains at the frontier of contemporary 
research. 

But these are deeper issues and we must reluctantly move on. 


3.4.1  Eliciting a Utility Function 


If we can measure outcomes in utilities, instead of say dollars, then we can
be confident that using decision trees and the expected value criterion has
a solid basis. The analysis of the decision trees proceeds just as explained
above, except that values (of outcomes, of decision nodes, and of chance
nodes) are denominated in utilities. So if we can measure outcomes in terms
of utilities we are in a happy circumstance. How, then, do we actually go
about making those measurements?

What we need is to transform the values of outcomes measured in dollars,
oranges, square feet of living space, and so on, into the realm of utility. We do
this by finding a utility function that takes a quantity (of dollars, oranges,
square feet of living space, or whatever) as input and returns a number
representing the utility of that quantity. For reasons that are beyond the
scope of our present purposes, we can set the utilities of any two outcomes as
we please.³ For convenience, we assume throughout that the decision maker's
preferences are monotone functions of the underlying argument and we set
the utility of the worst outcome considered to be 0 and the utility of the best
outcome considered to be 100.

³Briefly, each person's utility function (in a given context) is unique up to a positive
linear transformation. That is, if U(x) is a utility function, then for any other utility
function, U′(x), U′(x) is a positive linear transformation of U(x) if and only if U′(x) =
α + βU(x), for some numbers α and β, where β > 0. Temperature scales are positive
linear transformations of each other. For example, C = (5/9) · (F − 32). With utilities, as
with temperature scales, we can establish arbitrarily an origin and a unit of measurement.
Thus, with utility we can use [0, 1] or [0, 100] or [4.2, π] or whatever is convenient.
The utilities of all other outcomes are assessed as follows: we ask the
decision maker several questions about his or her preferences and derive
a utility function from the answers given. Typically, we assume a function
of a particular form, we ask several questions, and we find and use the best
fitting function of the assumed form. We can, however, elicit utilities for
particular outcomes without making any assumption about the form of the
underlying utility function. There are many ways of doing this. Let us look
at one.

For the sake of concreteness, we will advert to the parking meter problem.
Suppose that instead of working with expected dollars, we decide to perform
the analysis in terms of expected utilities. Our best outcome is $0.00, not
plugging the meter and not getting a ticket. Set the utility of this outcome
to 100: u($0.00) = 100. Our worst outcome is −$15.32, getting a ticket after
not plugging the meter. Because $15.32 is such an awkward number and
because we would like to build in some flexibility (What if the postal rates
go up?), we will set the utility of −$18.00 to 0: u(−$18.00) = 0. Given this,
we can construct a lottery for which we can calculate its expected utility:
[p : u($0.00), u(−$18.00)] = [p : 100, 0]. The expected utility of this lottery
is p · 100 + (1 − p) · 0 = p · 100. So, if p = 0.5, the expected utility is 50.

We are now ready to ask our decision maker a question: "Dear Decision
Maker, Suppose that you are presented with the lottery,

\[ [0.5 : \$0.00, -\$18.00] \tag{3.34} \]

This is not exactly a nice situation, since you stand to gain nothing and
could lose $18.00. The chance is even either way. Suppose, further, that you
are stuck with this lottery, unless you can get someone to take it away from
you. What is the most you would pay someone to take this lottery off your
hands?" Utility theory says nothing about how our decision maker should
answer. What is right is a matter of personal preference. What utility theory
offers is this: if the decision maker is willing to answer a short series of such
questions, then a reliable model of the decision maker's preferences can be
built and used in a large number of independent cases. Since these cases
may be quite complex, it is reasonable to hope that the model, built from
the decision maker's judgments in fairly simple cases, will be more accurate
on complex cases than the decision maker's own direct judgments. Such are
the hopes for this sort of thing.
The decision maker (who prefers more money to less) has a rational
right to give us any answer between $0.00 and −$18.00. Suppose the decision
maker reasons as follows: "I'd like to pay as little as possible and in a market
situation I would negotiate and shop vigorously, but I only have $20.00 with
me today and if I pay more than $12.00, I won't be able to have lunch and
ride the bus home. So, that's the most I would pay. Anything more and I'll
take my chances." What the decision maker is telling us is that he or she is
indifferent between losing $12.00 for sure and facing the lottery, which has a
utility of 50. So, u(−$12.00) = 50. Note that the expected dollar value of
our lottery is −$9.00, so our decision maker is willing to pay more than the
expected dollar loss in order to get rid of this gamble. In such conditions,
we say that the decision maker is risk averse. If the decision maker were
only willing to pay $7.00 to be rid of the lottery, he or she would be risk
seeking. Finally, if the decision maker's expected utility is identical with the
expected monetary value of the lottery, i.e., if the most the decision maker
would pay to be rid of the lottery is $9.00, then we say the decision maker
is risk neutral. If the decision maker is risk neutral, then using the utility
function instead of, here, the dollar amounts in the decision tree will not
affect the recommended decision.
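
The three risk attitudes can be told apart by comparing the sure amount the decision maker would accept (here, minus the most he or she would pay) with the expected dollar value of the lottery. The sketch below is our own illustration of that comparison; the function name and the tolerance parameter are assumptions, not part of the text.

```python
def risk_attitude(max_payment, p_best, best, worst, tol=1e-9):
    """Classify a decision maker from the most they would pay to be rid of
    the lottery [p_best : best, worst] (all amounts in dollars).
    """
    expected_value = p_best * best + (1 - p_best) * worst
    certainty_equivalent = -max_payment   # paying x for sure means ending at -x
    if abs(certainty_equivalent - expected_value) <= tol:
        return "risk neutral"
    if certainty_equivalent < expected_value:
        return "risk averse"      # accepts a sure loss worse than the expected loss
    return "risk seeking"         # only accepts a sure loss better than the expected loss

# The lottery [0.5 : $0.00, -$18.00] has expected value -$9.00.
for payment in (12.00, 9.00, 7.00):
    print(payment, risk_attitude(payment, 0.5, 0.00, -18.00))
```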

We are now in position to fit a functional form to our elicited point. If
the decision maker is risk neutral, then we would have a straight-line utility
function, as follows:

\[ u(x_i) = 100 \cdot \left( \frac{x_i - w}{b - w} \right) \tag{3.35} \]

where x_i is the value (here, dollar value) of the outcome in question, b is the
value of the best outcome under consideration (here, b = $0.00) and w is the
value of the worst outcome under consideration (here, w = −$18.00). Thus,
in the risk neutral situation,

\[ u(-\$9.00) = 50 = 100 \cdot \left( \frac{-\$9.00 - (-\$18.00)}{\$0.00 - (-\$18.00)} \right) \tag{3.36} \]

We can generalize Equation 3.35 in a simple manner, and accommodate risk
aversion and risk seeking:

\[ u(x_i) = 100 \cdot \left( \frac{x_i - w}{b - w} \right)^{a} \tag{3.37} \]

where a is a risk aversion factor. When a = 1, the resulting utility function
is risk neutral. When 0 < a < 1 the utility function is risk averse, and when
a > 1 the function is risk seeking.

Suppose we choose to model with Equation 3.37. What has our decision
maker told us about the value of a in this problem? We have

\[ 50 = 100 \cdot \left( \frac{-\$12.00 - (-\$18.00)}{\$0.00 - (-\$18.00)} \right)^{a} \tag{3.38} \]

which reduces to

\[ a = \frac{\ln(1/2)}{\ln(1/3)} = 0.63 \tag{3.39} \]

Thus, our utility function for this (hypothetical) decision maker, for dollars
over the range $0.00 to −$18.00, is

\[ u(x_i) = 100 \cdot \left( \frac{x_i - (-\$18.00)}{\$0.00 - (-\$18.00)} \right)^{0.63} \tag{3.40} \]

Applying this to the parking meter problem we find that the expected utility
of not plugging the meter is

0.4 · u(−$15.32) + 0.6 · u($0.00) = 0.4 · 30.12 + 0.6 · 100 ≈ 72

On the other hand, the utility of plugging the meter is

u(−$1.50) = 95

So, again it is best to feed that meter.
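
The fitting step and the resulting expected utilities can be reproduced with a short sketch. The dollar amounts and the power-function form come from the text; the function names are ours, and the code keeps the unrounded value of a, so its results differ slightly from figures computed with a = 0.63.

```python
import math

BEST, WORST = 0.00, -18.00   # dollar values assigned utilities 100 and 0

def fit_risk_factor(x, utility):
    """Solve utility = 100 * ((x - w) / (b - w)) ** a for the exponent a."""
    return math.log(utility / 100.0) / math.log((x - WORST) / (BEST - WORST))

def u(x, a):
    """Power utility function on the range WORST..BEST, scaled to 0..100."""
    return 100.0 * ((x - WORST) / (BEST - WORST)) ** a

# Elicited point: indifferent between losing $12.00 for sure and a utility of 50.
a = fit_risk_factor(-12.00, 50.0)

eu_no_plug = 0.4 * u(-15.32, a) + 0.6 * u(0.00, a)
eu_plug = u(-1.50, a)

print(f"a = {a:.2f}")
print(f"expected utility of not plugging = {eu_no_plug:.1f}")
print(f"utility of plugging              = {eu_plug:.1f}")
```

Changing the elicited point passed to fit_risk_factor is a quick way to explore how the recommendation responds to a different risk attitude.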

Finally, note that if the decision maker is sufficiently risk seeking, the 
expected utility of not plugging the meter will be higher than the expected 
utility of plugging the meter. We leave it as an exercise to the reader to
determine just how risk seeking our decision maker would have to be to 
prefer leaving the meter unplugged. 
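
The arithmetic of this section can be checked with a short script. What follows is a minimal sketch in Python, not part of the original notes. The function names are made up for illustration. The ticket outcome is taken to be -$15.32, the value for which the power utility with a = 0.63 gives the stated utility of 30.12, and the probability of a ticket is taken to be 0.4.

```python
import math

B = 0.00     # best outcome under consideration, in dollars
W = -18.00   # worst outcome under consideration, in dollars


def utility(x, a=1.0, b=B, w=W):
    """Power utility on a 0-100 scale: 100 * ((x - w) / (b - w)) ** a."""
    return 100.0 * ((x - w) / (b - w)) ** a


def fit_risk_factor(x, u, b=B, w=W):
    """Solve u = 100 * ((x - w) / (b - w)) ** a for the exponent a."""
    return math.log(u / 100.0) / math.log((x - w) / (b - w))


# The decision maker assigns a utility of 50 to the outcome -$12.00.
a = round(fit_risk_factor(-12.00, 50.0), 2)  # rounded to two places, as in the text

# Assumed inputs: ticket outcome -$15.32, probability of a ticket 0.4.
eu_no_plug = 0.4 * utility(-15.32, a) + 0.6 * utility(0.00, a)
u_plug = utility(-1.50, a)

print(f"a = {a}")
print(f"expected utility of not plugging = {eu_no_plug:.1f}")
print(f"utility of plugging = {u_plug:.1f}")
```

Passing a different value of a to utility is one way to start on the exercise just mentioned.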











A short list of points will serve our purposes: 
1. The elicited utility function is local to the model. If a broader range of 
dollars is at stake, the function must be reassessed. That is one reason 
it is usually a good idea to assume, as we did, a slightly wider range of 
outcome values than is necessary for immediate purposes. 


2. The functional form we used is but one of many. The significant differences 
are typically minor, for practical purposes, among the functional 
forms that are, as ours is, constantly increasing (or decreasing). 





3. Interpersonal comparisons of utility, e.g., Susan's utilities compared to 
Maggie's utilities, are theoretically nonsense, however much we would 
like it to be otherwise. (If, e.g., Susan's utility function is a 
positive linear transformation of Maggie's, the parameters in the 
transformation (α and β) are unknown and unknowable.) 





4. Utility theory is very widely accepted as normatively sound. Still, it has 
its critics, as well as its paradoxes and anomalies. It should be used as 
any other tool, with caution, common sense, and lots of post-evaluation 
(especially sensitivity) analysis. 


5. Elicitation of utility functions is something of an art. If you find that 
the functions elicited are multimodal or in other ways odd-looking, get 
some expert advice before relying heavily on the associated model. 





6. It is easy, and quite common, during the elicitation process for deci- 
sion makers to give answers that do not fit the chosen functional form 
exactly. When this happens, you will need to fit the functional form 
approximately, with some error. Check to see that the errors are fairly 
small and not all on one side. 





3.5 Multiattribute Utility Theory (MAUT) 
Models 


Another question: 


• Expected utility theory seems useful in cases in which uncertainty is 
present and the outcomes can be measured in terms of dollars or oranges 
or square feet of living space or other simple denominations. But most 
important decision problems involve outcomes that are not so simple. 
Outcomes may have many aspects to them. For example, in choosing 
a supplier, one supplier may be better on cost and worse on quality 
and delivery time. How are we to model these sorts of tradeoffs with 
expected utility theory? 


The question is asking about what, in the jargon of the utility theory 
literature, are called multiattribute decision problems, in which the outcomes 
have associated with them several attributes, or aspects or dimensions, of 
value. Indeed, such problems are the norm. Just about anything one buys 
has multiple dimensions of value. In a car, the attributes include reliability, 
mileage, safety, and resale value. In an apartment, the attributes include 
living space, quality of the neighbors, commitment of the landlord, amenities 
in the neighborhood, location, and much else. 

As usual, we need to reduce things to a single number and proceed as 
before. So, we will reduce the values of multiattribute outcomes to single 
(utility) values, and then proceed as before. There are basically two strategies. 
First, we might translate measures on each attribute to a common scale, 
such as dollars, and then translate that scale to utility. When this is natural, 
it is a perfectly sensible thing to do, but often it is not, and one should use 
the second strategy: create a multiattribute utility function. 

In the second strategy, we develop distinct utility functions for each 
attribute of interest. Such utility functions are called unidimensional utility 
functions, in order to distinguish them from multiattribute functions. Earlier, 
in the parking meter example, we saw how a single unidimensional function 
could be developed. In the multiattribute case, we repeat this process 
for each attribute. Once the individual (unidimensional) utility functions 
have been obtained, we then combine them mathematically to create a 
multiattribute utility function. 

The simplest, and most commonly used, form of combination is the 
weighted average, or linear or additive, model: 

\[ U(X_i) = k_1 u_1(x_{i1}) + k_2 u_2(x_{i2}) + \cdots + k_n u_n(x_{in}) = \sum_{j=1}^{n} k_j u_j(x_{ij}) \qquad (3.41) \]


A word on the notation. We use the upper-case \(U\) to represent multiattribute 
utility functions and the lower-case \(u\)s, possibly subscripted, to represent 
unidimensional utility functions. Thus, the \(u_j\) are unidimensional utility 
functions and there is one for each of the \(n\) dimensions 
or attributes at hand. The \(k_j\) are relative importance weights. They range 
from 0 to 1 and must sum to 1. A multiattribute outcome is represented by 
a vector, \(X_i = (x_{i1}, x_{i2}, \ldots, x_{in})\), in which an individual entry, \(x_{ij}\), is the 
score, or measure, or description on dimension \(j\) for outcome \(i\). 

The weighted average model, Equation 3.41, has much to recommend 
it. It is simple, it is intuitive, and it is robust. But it does make certain 
assumptions that may be incorrect and could invalidate the model. For 
present purposes, we will say little about this issue except to note that the 
model requires a kind of independence that is easily checked (more on this 
shortly). 
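
To make Equation 3.41 concrete, here is a minimal sketch in Python, not part of the original notes. The function name, the three apartment attributes, the weights, and the utility scores are all hypothetical. The unidimensional utilities are assumed to have been assessed already, on a 0 to 100 scale.

```python
def additive_utility(weights, utilities):
    """Equation 3.41: the sum over attributes j of k_j * u_j(x_ij)."""
    if len(weights) != len(utilities):
        raise ValueError("need exactly one weight per attribute")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("the weights k_j must sum to 1")
    return sum(k * u for k, u in zip(weights, utilities))


# Hypothetical apartment scored on living space, location, and amenities.
k = [0.5, 0.3, 0.2]        # relative importance weights k_j
u = [70.0, 40.0, 90.0]     # u_j(x_ij), each already on a 0-100 scale

U = additive_utility(k, u)
print(f"U = {U:.1f}")
```

The check that the weights sum to 1 is a design choice. It catches the common slip of passing raw ratio estimates where normalized weights are expected.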

In order to build a specific weighted average model, Equation 3.41, there 
are several things we need to know. Here is a systematic list, with discussion. 
The list is an augmented version of the SMART (simple multiattribute rating 
technique) procedure from Ward Edwards [6, 12]. 


1. Identify whose values are to be modeled. 


Whose decision is it? In whose name is the decision being made? This 
is an obvious and elementary first step, but it is one often neglected. 


2. Discover the purpose of the modeling exercise. 
What goals are relevant? Why are we doing this? Answers to these 
questions will serve to clarify and focus the exercise on the issues really 
at hand. 


3. Determine the alternatives to be evaluated. 
A reasonably precise determination of the relevant options, or entities, 
to be evaluated is important for many reasons, especially for determin- 
ing what the relevant dimensions of value are. 


4. Identify the relevant dimensions, or attributes, for evaluating the 
alternatives. 


This is a crucial and sensitive step. Avoid having too many attributes. 
[...] Choose attributes that can be measured in a 
meaningful and [...] way. Determine the values over which the 
individual attributes will range. For example, if the attribute can be 
measured in dollars, choose a range of dollars that includes the values 
likely to be encountered in the options to be evaluated. 


5. Rank the attributes in order of relative importance. 


Ask the following question: 


Suppose we have n attributes. Consider n distinct options 
or alternatives, X_1, X_2, ..., X_n. Now imagine that we create 
the following hypothetical options for the purpose of the 
analysis. Option X_1 has the top possible scores on each of 
the n attributes, except attribute 1, on which X_1 has the 
lowest possible score. Option X_2 has the top possible scores 
on each of the n attributes, except attribute 2, on which X_2 
has the lowest possible score; and similarly for the other 
alternatives. How do you rank these n alternatives in order of 
preference? 


From the resulting ranking we can read off the relative importances of 
the attributes. If X_i has the highest (second highest, third highest, ...) 
rank, then attribute i has the lowest (second lowest, third lowest, ...) 
relative weight. 

Comment: The questioning can proceed in other ways, and it is 
advisable to try them as well. For example: 


Suppose we have n attributes. Consider n distinct options 
or alternatives, X_1, X_2, ..., X_n. Option X_1 has the lowest 
possible scores on each of the n attributes, except attribute 
1, on which X_1 has the highest possible score. Option X_2 
has the lowest possible scores on each of the n attributes, 
except attribute 2, on which X_2 has the highest possible 
score; and similarly for the other alternatives. How do you 
rank these n alternatives in order of preference? 


From the resulting ranking we can read off the relative importances of 
the attributes. If X_i has the highest (second highest, third highest, ...) 
rank, then attribute i has the highest (second highest, third highest, 
...) relative weight. 

The final rankings from these two question forms should be the same. 
If they are not, this probably indicates that the linear model is 
inappropriate. 

Note well: The question is (properly) asked with direct reference to 
the highest and lowest possible scores on each dimension. Setting these 
scores is done in step 4. We would expect that different ranges on 
an attribute will result in a different relative weight for that attribute. 
If, for example, the range of possible scores on an attribute is small, 
then it is likely that the relative weight on that dimension should also 
be small. 





6. Obtain ratio estimates of the relative weights, relative to the least 
important attribute. 

Give the lowest-ranked (least important) attribute a 10. Now ask: For 
the second-lowest-ranked attribute, what should its score be? Continue 
in this fashion for all the remaining attributes. 

7. Normalize the ratio estimates of the relative weights. 

We obtain the individual k_j values (which, recall, range from 0 to 1 and 
sum to 1) by dividing each estimated score (from step 6) by the sum 
of all the score estimates. 

Note: This is also a good technique to use in spreadsheet 
implementations, since it allows us to do sensitivity analysis on any k_j or 
combination of k_js. 


8. Obtain a unidimensional utility function for each attribute (for the 
scale and range determined previously). 

We have already seen how to do this in our discussion of the parking 

meter problem. 

Note: We assume that each utility function ranges from 0 to 100, but 


all that is really necessary is that all the utility functions have the same 
range. 


9. Score each attribute of each alternative. 


Recall that each alternative may be thought of as a vector of descriptions: 


\[ X_i = (x_{i1}, x_{i2}, \ldots, x_{ij}, \ldots, x_{in}) \]


Obtain those \(x_{ij}\) descriptions. (Note: the \(x_{ij}\) values obtained in this 
step need to be numbers, e.g., numbers of dollars. This does not 
preclude assessing the values of attributes that are not naturally measured 
numerically. Far from it. One merely needs to develop, say, a 
phrase-anchored scale that maps verbal descriptions to numbers. The details 
are beyond the scope of our present purposes.) 





10. Calculate the utilities of the various options using a weighted average 
model. 


At this point, we have all the information needed to calculate the util- 
ities of the alternatives using the linear additive model, Equation 3.41. 





So, we make the calculations. 


11. Perform post-evaluation analysis and decide. 

Prima facie, the best alternative is the alternative with the highest 
calculated expected utility. Before making a final choice, however, you 
should examine carefully what the model is saying. How does the ranking 
of alternatives change with slight changes in the relative weights 
(k_js), scores on the attributes, and unidimensional utility functions? If 
the changes make sense, then that is a sign that the model is valid. If the 
model also exhibits a fair degree of robustness, then you may be able 
to decide with confidence. 
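
Steps 6, 7, 10, and 11 lend themselves to a short computational sketch. The Python below is illustrative only and not part of the original notes. The supplier names, attribute names, ratio estimates, and utility scores are hypothetical, and the unidimensional utilities (step 8) are assumed to be in hand already, on a 0 to 100 scale.

```python
def normalize(ratio_scores):
    """Step 7: divide each ratio estimate by the sum of all the estimates."""
    total = sum(ratio_scores.values())
    return {name: score / total for name, score in ratio_scores.items()}


def rank(alternatives, weights):
    """Step 10: weighted-average utility of each alternative, best first."""
    totals = {
        alt: sum(weights[attr] * u for attr, u in utils.items())
        for alt, utils in alternatives.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Step 6 (hypothetical answers): the least important attribute gets a 10.
ratios = {"delivery": 10, "quality": 15, "cost": 25}
weights = normalize(ratios)

# Steps 8 and 9 (hypothetical): utilities of each score, on a 0-100 scale.
alternatives = {
    "Supplier A": {"cost": 80, "quality": 40, "delivery": 50},
    "Supplier B": {"cost": 60, "quality": 70, "delivery": 90},
}
base_ranking = rank(alternatives, weights)

# Step 11: what if cost mattered far more than first estimated?
what_if_ranking = rank(alternatives, normalize(dict(ratios, cost=100)))

print(base_ranking)
print(what_if_ranking)
```

With these made-up numbers the preferred supplier changes when the ratio estimate for cost is raised from 25 to 100, which is the kind of reversal post-evaluation analysis is meant to expose.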








3.6 Comments on the Use of Decision Analysis 


There is very much more to this topic, but we have here a serviceable 
introduction. The next step is to build and implement some models, and we take 
that up in the sequel. 

1. MAUT models are often, even usually, used without decision trees. 
When they are used with decision trees, it is appropriate to model 
multiattribute outcomes, get their utilities from the MAUT model, and 
proceed in the usual manner. Of course, this adds to the complexity 
of the overall decision analysis and mandates careful post-evaluation 
analysis. 


2. These (decision trees and MAUT models) are good methods and should 
be used more. In fact they are not used nearly as much as they 
should be. Why? 


(a) People tend to think that they are experts in making decisions, 
but they are not! 


(b) Computerization is needed for all but the smallest models. Good 
packages are not readily available and well understood. In the 
sequel, we shall see how to build decision tree and MAUT models 
using spreadsheets. 


(c) It takes time and effort to build decision analytic models, and the 
basic techniques are unfamiliar to many people. Often, it is best 
to hire special consultants to help structure the decision and elicit 
information. 

(d) Extensive use of subjective data, data elicited from individuals. 
This bothers some people (as it should). Defense: this is the 
best we can do, and we then do sensitivity analysis. At least the 
subjective judgments are up front and open. 





3. A very important value: structuring the decision. Decision analytic 
models are helpful, if only for encouraging the deliberate, reflective 
structuring of the problem. Because the resulting models are public, 
they can be inspected, queried, and generally poked by all relevant 
stakeholders. 


3.7 Bibliographic Notes 


Recommended reading: from Bazerman [1], chapter 2, "Biases," of Judgment 
in Managerial Decision Making. 


There are many excellent books and articles on decision analysis. Among 
them are: [2, 7, 11, 12]. For information on, and a source of comfort for, the 
linear model, see [5]. 


Chapter 4 


Notes on: Decision Trees with 
Spreadsheets¹ 


4.1 Introduction 


Our topic now is implementing decision analysis with spreadsheets. What 
do we want from an implementation? 


1. Calculate the expected value of the tree 

2. Indicate the optimal decision path 

3. [...] 

4. Indicate invalidities (at least certain kinds of them) 

5. Facilitate sensitivity analysis 

6. Be easy to modify and maintain 


How are we going to do it? First, here are the essential spreadsheet 
skills with which you need to be familiar to perform the exercise of building 
a decision tree in this way. 


1. Layout and design of Excel (e.g., worksheets and workbooks) 


¹File: ct-dtree-with-ss. Revised: 951222, 951128, 951022. From: MISMNotes-dtreewith-ss. 
Revised: September 20, 1995. 

2. Presentation and formatting of Excel objects and entities (e.g., naming 
worksheets, using attractive formatting, coloring things, setting 
protection) 

3. Absolute and relative addressing 


4. Formulas (e.g., IF, MAX) 





5. Drawing and graphics 
6. Charts 
7. Goal seeking, data tables 





We will build a decision tree DSS in Excel, version 5.0 or later. (We 
focus on a particular spreadsheet product for concreteness, but the general 
lessons and principles apply to essentially all spreadsheet products.) 

The usual strategy, with any spreadsheet product, is to focus on making 
the display look like a decision tree. Problems: 


1. This is really forcing things in a spreadsheet. Lots of work. 
2. Hard to maintain. 
3. Hard to manipulate. 


Of course, when presenting results of an analysis it will almost always be 
useful to present the relevant decision trees to your audience. But, the decision 
tree display should be driven by a primary representation outside the tree. 

Another way: reduce to tabular form and work that way. Think of 
elements of the problem and organize this way, with elements of the same 
type on a single worksheet (again: assume we are using Excel 5). 


1. Rename "Sheet1" as "Introduction." Put comments, a table of 
contents, and other overview text and information here. 


2. Rename "Sheet2" as "Decision Tree." Later, draw the decision tree on 
this sheet and fill in the values with names referring to other parts of 
the workbook. 

3. Rename "Sheet3" as "Input Parameters." Put all input parameters on 
this sheet, named and laid out as described below, §4.3. 


4. Rename "Sheet4" as "Interior Results." Put all formulas for 
intermediate results on this sheet, named and laid out as described below, 
§4.4. 

5. Rename "Sheet5" as "Chance Nodes." Put all chance nodes, 
represented as tables, on this sheet, named and laid out as described below, 
§4.5. 

6. Rename "Sheet6" as "Decision Nodes." Put all decision nodes, 
represented as tables, on this sheet, named and laid out as described below, 
§4.6. 


This will serve as a basic template for us, to be revised and expanded later. 


We will begin by working on the parking meter problem using the decision 
rule of expected dollars. Utilities come later. 


4.2 Introduction to Decision Tree Implementation 


Below, in §§4.3-4.5, we will give some essential information on how to 
implement a decision tree program in a spreadsheet for the parking meter problem. 
What follows should be seen as a guideline. It is meant to be helpful, not 
to be complete. In an actual implementation we would need to add more, 
for example, alerters to indicate the optimal decisions and to indicate invalid 
data (parameter) values. 

Think, for starters, of your program as being divided up into a series of 
tables, as follows. You should be able to figure out appropriate locations in 
the worksheets for the various names and labels. 

And, recall the basic tree for our parking meter problem, Figure 4.1. 





Figure 4.1: Decision Tree for the Parking Meter Problem 




4.3 “Parameters” Sheet 


Table 4.1 lays out the input values for the model's parameters. 


| Value | Name | Meaning                                      |
| [...] | AA   | Get a ticket                                 |
| [...] | BB   | Don't plug meter and don't get a ticket      |
| [...] | CC   | Plug meter                                   |
| [...] | PP   | Relative probability of getting a ticket     |
| [...] | QQ   | Relative probability of not getting a ticket |





Table 4.1: Parameters for the Parking Meter Problem 


Note: the symbols should all be defined as names for the values to their 
left. AA, BB, and CC are outcome values. It is a good idea to label 
them as such in some way. Finally, PP and QQ are relative probabilities 
only. As parameters, the values of either or both may be changed in this 
table, e.g., during sensitivity analysis. 

Admittedly, the AA, BB, etc. notation and naming scheme here could 
be improved. You should think of mnemonic names, e.g., getticket instead of 
AA, and relprobticket instead of PP. 

4.4 "Formulas" Sheet 


Table 4.2 presents the formulas used in this model (there happen to 
be only two formulas used). These formulas, named "probticket" and 
"probnoticket," calculate the absolute probabilities of getting (resp., not getting) 
a ticket, if the meter is not plugged. 





| Formula     | Name         | Meaning                                 |
| =PP/(PP+QQ) | probticket   | The probability of getting a ticket     |
| =QQ/(PP+QQ) | probnoticket | The probability of not getting a ticket |





Table 4.2: Formulas for the Parking Meter Problem 


4.5 “Chance Nodes" Sheet 


In the parking meter problem we have only one chance node, D, which will 
be named DD for the sake of Excel. See Table 4.3. 


[Table 4.3 body not recovered; one surviving cell reads =probnoticket.] 


Table 4.3: Chance Nodes: Layout and Key Formulas 


With such a table, pick a convenient cell and put into it a formula that 
calculates the expected value. Hint: use the SUMPRODUCT function. Name 
this cell DD. 
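
The logic behind the "Formulas" and "Chance Nodes" sheets can be sketched outside the spreadsheet. The Python below is an illustration only and not part of the original notes. It reuses the names AA, BB, PP, QQ, probticket, probnoticket, and DD from the tables, but the function names and all numeric values are assumptions.

```python
def normalize_pair(pp, qq):
    """Table 4.2: turn two relative probabilities into absolute ones."""
    total = pp + qq
    if total <= 0:
        raise ValueError("relative probabilities must have a positive sum")
    return pp / total, qq / total


def expected_value(probabilities, values):
    """Sum of probability * value, the job SUMPRODUCT does on the sheet."""
    return sum(p * v for p, v in zip(probabilities, values))


# Assumed parameter values, for illustration only.
PP, QQ = 2.0, 3.0        # relative probabilities: ticket, no ticket
AA, BB = -15.32, 0.00    # dollar outcomes: ticket, no ticket

probticket, probnoticket = normalize_pair(PP, QQ)
DD = expected_value([probticket, probnoticket], [AA, BB])

print(f"probticket = {probticket}, probnoticket = {probnoticket}")
print(f"DD = {DD:.3f}")
```

Because PP and QQ are only relative, changing one of them during sensitivity analysis never produces probabilities that fail to sum to 1.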


4.6 “Decision Nodes” Sheet 


In the parking meter problem we have only one decision node, E, which will 
be named EE for the sake of Excel. See Table 4.4. 





Table 4.4: Decision Node EE: Layout 


Note that the formula, =CC, simply refers to the value of the parameter, 
CC, by the magic of naming. Further, we will name the expected value of the 
DD node as DD, for then the formula, =DD, will refer to the expected value of 
the DD node. (Again: This naming convention could be improved and you 
may want, e.g., to begin the name of each decision node with a "D" and each 
chance node with a "C" followed by some indication of what the daughter 
nodes are.) 

But we need two more things from the EE node implementation: 


1. The expected value of the node. This is just the maximum (use the 
=MAX function) of the values of the daughter nodes. You should create 
a formula for this in a cell and name the cell EE. 


2. The name of the daughter node corresponding to the optimal decision. 
There are several ways to handle this. Here's one. Add a new 
column to the decision table, as follows: 


    =IF(MAX($E$5:$E$6)=E5,1,0) 


Here, we are assuming that the "Value" column is in column E of the 
worksheet, that the =CC formula is in cell E5, and that the =DD formula 
is in cell E6. Now, in some convenient cell, say cell G4, put in the 
formula =D5, and name this cell, say, eebestdaughter. Then, write a 
macro that sorts the table's range, D5:F6, (or better, uses a name given 
to the table) in descending order. After you run the macro, cell G4 will 
contain the name of the best daughter cell from decision node EE. Later, 
you can assemble all such macros and attach them to a button. 
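
The two things wanted from the EE node, its value and the name of its best daughter, can be sketched in a few lines. This Python illustration is not part of the original notes, and it swaps the sort-the-table macro for a direct maximum lookup. The function name and the numeric values of CC and DD are assumptions.

```python
def decision_node(daughters):
    """Return (name of best daughter, its value) for a decision node.

    `daughters` maps each daughter node's name to its (expected) value.
    """
    best = max(daughters, key=daughters.get)
    return best, daughters[best]


# Assumed values, for illustration only.
CC = -1.50    # dollar value of plugging the meter
DD = -6.128   # expected dollar value of the chance node (not plugging)

eebestdaughter, EE = decision_node({"CC": CC, "DD": DD})
print(f"EE = {EE}, best daughter = {eebestdaughter}")
```

With these inputs the best daughter is CC, that is, plug the meter.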





4.7 Discussion 


Much more can be done. Here are some suggestions: 


1. On the "Decision Tree" worksheet you can draw the decision tree and 
label its elements using the existing names on the other worksheets. 
That way, when you alter the assumptions, e.g., the value of BB, the 
drawn decision tree can be more or less automatically updated. 


2. On the "Introduction" worksheet, you can describe the DSS that follows 
and how to work with it. 


3. You can improve the presentation to make it friendlier, easier to use, 
and easier to [...]. Example: Protect the worksheets from changes by 
users, except for the cells holding parameter input values [...]. 


4. You can do sensitivity analysis by asking what-if questions. Go to 
the "Parameters" worksheet and change values. See how the decision 
changes, if at all. You can also do more sophisticated analyses with 
data tables, graphics, and goal seeking commands. 


5. You can alter what has been done so far to create a more elaborate 
[...]. 


6. You can alter what has been done so far, and measure the outcomes in 
utilities, rather than dollars. 


And there is much more. Be creative! 




Appendix A 


Visual Basic for Applications: 
A Brief Tutorial¹ 


A.1 First Steps 


Visual Basic for Applications (VBA) is the macro language for Excel. It 
closely resembles Visual Basic, an independent language from Microsoft, and 
is used as the macro language for Microsoft Access and Word. In what 
follows, we will be talking for the most part about Visual Basic for Applications 
as it applies to Excel. We will feel free to call it VBA, EVB, Visual Basic, 
VB, etc., so long as the context makes confusion unnecessary. 

Macros consist of one or more VBA code chunks. These code chunks, 
called procedures, are either functions or subroutines. Here are some simple 
examples. 


' Here is a simple function. Use as any other Excel function. 
Function bob(x) 
    bob = x ^ 2 + 3.33 
End Function 
' Here is a simple VBA sub. 
Sub ted() 
    MsgBox "Hello, world!" 
End Sub 


¹File: dt-vbatutor. Created: 951128, from VBTUTORF.DOC. Revised: 951222. 

Figure A.1: Help Menu for Excel 5.0 


Note: comments begin with a single quote: 
' Everything afterwards in the line is ignored. 


In Excel, VBA macros reside on special workbook sheets, called modules. 
To make a macro, one may simply create a new macro module and type in 
the functions and procedures. More on this shortly. 

Information about VBA is published in many readily available sources. 
Both Microsoft and third parties publish extensive reference manuals and 
how-to books for VBA. In addition, VBA closely resembles Visual Basic and 
there is a large literature on that. For good online help on VBA in Excel, 
explore "Programming with Visual Basic" in the "Contents" window of the 
MS Excel help facility (see Figure A.1). We are assuming in these notes that 
the reader will do this. 





A.2 Second Steps 


A.2.1 Recording Macros 


Macros (VBA procedures) can be recorded. Use Record Macro under the 
Tools menu. After selecting Record New Macro, you will be prompted for 
the name of this new macro. Either give it a new name, or accept the default. 
A small window will then appear with a stop button in it. You click the stop 
button when you are done recording your macro. First, however, perform as 
usual some action in the workbook, e.g., copy one range of cells to another 
place. When you are done, stop the macro recorder by clicking the stop 
button. In sum, there is a four-step process to record a macro: 


1. Start the macro recorder. 
Do this by selecting the menu: Tools / Record Macro / Record New 
Macro. 

2. Name the macro. 
You will be prompted for a name and may accept the default presented 
by Excel, e.g., Macro1. Once you have done this, a window appears 
with a button for stopping the recording of the macro. 

3. Record the macro by performing normal activities in the workbook. 
It is wise to plan these out before starting to record. 

4. Stop recording the macro. 
Do this by clicking the stop macro button. 


This creates VBA code in a (usually new) module sheet, which Excel will 
call Module1 or some such thing. Module sheets reside with the other sheets 
of the workbook. As with the other sheets, you click on the tab to view 
the module sheet. When it appears, you will see VBA code against a blank 
background. While worksheets present spreadsheets (arrays of cells), macro 
sheets present text editors. Thus, you can examine and edit the VBA code. 

Notice, in particular, a couple of things with regard to your new macro 
module sheet. First, macro sheets come with a context-sensitive text editor. 
For example, comments (lines beginning with an apostrophe) come out green 
(by default) and reserved words come out blue and get capitalized 
automatically. Second, the new macro that you just recorded is a Sub, rather than a 
Function. 

Recording macros and examining the results is a good way of learning 
about VB, but it takes you only so far. We need to go further. 


A.2.2 Assigning a Macro to a Button or Graphic Object 


In order to run (or execute) a Sub macro, including macros created with the 
Record New Macro facility in Excel, one can assign the macro to 
a graphic object that can call the macro. Assigning a macro to a button or 
graphic object is easy. For a previously-existing object, select it (e.g., hold 
down the Ctrl key and click on the button or graphic object), then choose 
Assign Macro... from the Tools menu. You will be prompted with a list of 
existing Subs and you make your choice from the list. That done, you may 
now simply click on the graphic object and Excel will call the macro and 
cause it to be executed. 

Note: Typically, you will want to create a new button and assign the 
macro to it. Use Create Button from the Drawing toolbar and draw a new 
button. Excel will automatically prompt you to assign a macro. 








A.2.3 Functions versus Subs 


VBA functions return values (one value each), but cannot take actions 
otherwise. VBA subs (subroutines) do not return values, but can take actions. 
(However, VBA subs can set the values of variables and these variables may 
be accessed by other procedures.) VBA functions, once defined in a macro 
sheet, may be used in worksheet cells just as any of the functions Excel has 
built into it. 

Functions and subs may call one another, thus you may create very 
complex programs in VBA. We will discuss that later. First things first. Now, 
let's look at variables in VBA. 

A.3 Variables 


A.3.1 The Very Basics of Variables 
Here is a simple example involving VBA variables: 


' Here's an example of two variables in use, along with 
' a For...Next... control loop 
Sub variableExample1() 
    ' Assign the number 3 to the variable, MyVariable 
    ' Note: You make up your own, mnemonic, 
    ' names for your variables 
    MyVariable = 3 
    For I = 1 To MyVariable 
        MsgBox "Showing and counting: " & I 
    Next I 
End Sub 


The two variables are: MyVariable and I. 

Such program variables are used extensively in this sort of programming. 
Variables hold values and their values may change during program execution. 
Basically, you make computations and assign the results to variables. Then 
you make new computations, based on the assigned values of these variables, 
and you assign the results to other variables. And on and on. 


A.3.2 Variables Have Data Types 
Some variables are for holding numbers, some for text, some for dates, and 
so on. VBA has a special type of variable, called the variant type. It can 
hold about anything, but in general you should avoid being so loose. 

The main data types in VB are 

1. Boolean. Values: True or False 

2. Integer. Values: -32,768 to 32,767 

3. Long (integer). Values: -2,147,483,648 to 2,147,483,647 

4. Single (single precision floating point). Values: [lots] 

5. Double. Values: [lots more than singles] 

6. Currency. Values: [lots] 

7. Date. Values: January 1, 0100 through December 31, 9999 

8. String. Values: 0 through 65,535 characters 

9. Variant. Values: Any numeric value up to the range of a Double or any 
character text 


You set the data type of a VB variable by declaring it. But, if you don't 
declare the data type for a variable (as in the variableExample1 procedure, 
above), then the default is that the variable is of type variant. 

Within a procedure, you may declare variables with the Dim (dimension) 
statement: 


' Now here's variableExample1 again, but 
' with the variables properly declared 
Sub variableExample2() 
    ' Assign the number 3 to the variable, MyVariable 
    ' Note: You make up your own, mnemonic, 
    ' names for your variables 
    Dim MyVariable As Integer 
    Dim I As Integer 
    MyVariable = 3 
    For I = 1 To MyVariable 
        MsgBox "Showing and counting: " & I 
    Next I 
End Sub 


A.3.3 Local and Global Variables 


Variables declared this way (explicitly in a procedure with Dim or as variant 
by default) are local to the procedure. That is, you can't refer to them (use 
their names and get their values) in other procedures. In fact, as illustrated 
in variableExample1 and variableExample2, above, you can actually reuse 
the same variable names in different procedures. When you do this, you are 
really working with different variables, which happen to have the same names. 
(Advice: except for counters, like I, and explicitly temporary variables, e.g., 
mytemp, don't do this.) 

Point of style: It is normally considered good programming practice to 
declare all your variables explicitly. Why? In Visual Basic, you can enforce 
this by declaring 


    Option Explicit 


in the declarations section of each code module. (The declarations section of 
a module is the space before the first procedure, i.e., at the top.) You should 
do this. Then, when VB encounters a variable that hasn't been declared, VB 
generates an error message. This may initially be irritating, but it's a very 
good idea in the long run, since it prevents otherwise undetected errors. 

The scape of a variable need not be limited to being local, however, In 
УВА m Excel the scope of ë variable may be the procedure in which it is 
declared (in which case we say it is local), the module in which it is declared, 
or the entire workbook- 

When the scope is to be local (within procedure only), declare variables
at the beginning of the procedure with the Dim statement. (See also, in
the Help facility: the Static statement.) See the examples above, procedures
variableExample1 and variableExample2.

When the scope of a variable is to be the module in which it is declared,
declare the variable at the top of the module (in the declarations section),
using Dim. (See also in the Help facility: the Static statement.)

When the scope of a variable is to be the entire workbook, pick a module,
and declare the variable in the declarations section using Public (cf.
Global).

Here's an example: 





' Each Module begins with a declarations section, the
' portion at the top, before the procedure declarations
' begin.


' Declare explicit data type checking


Option Explicit
Public MyVar As Integer


Sub publicExample1()
    MyVar = 17
    MsgBox "We're in publicExample1 and MyVar = " & MyVar
    publicExample2
End Sub


Sub publicExample2()
    MsgBox "We're in publicExample2 and MyVar = " & MyVar
End Sub


Note: "Module-level variables remain in existence while Visual Basic is
running until the module in which they are defined is edited" (Visual Basic
User's Guide, Microsoft Excel 5.0, p. 121). So play around with this example
and see how this stuff works.


A.3.4 Reading from an Excel worksheet into an Excel 
Visual Basic Variable 


Study these examples 


Sub readfromworksheet1()
    Dim fromworksheet
    ' Note that with Cells(1,2) we are referencing
    ' the first row and second column of the worksheet.
    fromworksheet = Worksheets("Sheet1").Cells(1, 2).Value
    ' The following line works just as well.
    'fromworksheet = Worksheets("Sheet1").Range("b1").Value
    MsgBox "We're in readfromworksheet1 and fromworksheet = " & _
        fromworksheet
    ' Note above, use of " _" as a continuation sign.
End Sub


Sub readfromworksheet2()
    ' Now assume we have defined a range, called testrange1,
    ' whose
    ' scope is B2:D4
    Dim fromworksheet
    ' Note that with Cells(1,1) we are referencing
    ' the first row and first column of the named range.
    fromworksheet = Range("testrange1").Cells(1, 1).Value
    ' The following line works just as well.
    'fromworksheet = Worksheets("Sheet1").Range("b2").Value
    MsgBox "We're in readfromworksheet2, fromworksheet = " & _
        fromworksheet
    ' Note above, use of " _" as a continuation sign.
End Sub


A.3.5 Writing from an Excel Visual Basic Variable to 
a Worksheet 

Just switch from left to right, e.g.,

Worksheets("Sheet1").Cells(1, 2).Value = fromworksheet


The equal sign, =, in this context is an assignment statement. It puts the 
stuff on the right into the stuff on the left. 





A.4 Boolean Operators 
Often we have to test for the truth or falsity of an expression. For example,


MyVar > 7.3


will be true if MyVar has a value that is greater than 7.3. If its value is
7.3 or less, the expression will be false. Note: If MyVar is Null, then the
expression evaluates to Null. See comparison operators. This greatly
complicates things and in these notes, I'll ignore the question of nulls.

So, expressions may be either true or false, in which case we say they
have truth values. Expressions having truth values may be combined using
Boolean operators to yield larger expressions, which also have truth values.
The Boolean operators available in VB are: And, Or, and Not.

Each of these operators has a characteristic truth table, as follows. 


Table A.3: Truth Table for Not


Interestingly, many other Boolean (truth-functional) operators are possible.
That is, there are a lot more other truth tables possible. But these
three suffice, in that with them any other possible Boolean (truth-functional)
operator may be defined. (How would you prove this?) In fact, Not and And
are sufficient in this way, as are Not and Or. Here's something of a proof.


Not(Not exp1 Or Not exp2)


Table A.4: Truth Table Showing Definition of And in terms of Not and Or


Can you think of a single Boolean operator that is by itself sufficient?

So, we often need Boolean combinations of statements (or expressions) in
programming. The bottom line is that And, Or, and Not are sufficient for
expressing anything we can possibly express in this way.
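
The claim that And can be defined from Not and Or is easy to check by brute force. Here is a minimal sketch, assuming a helper sub named checkAndFromNotOr (my name, not part of vbtutor.xls), that runs through all four combinations of truth values and compares exp1 And exp2 with Not(Not exp1 Or Not exp2).

```vba
Option Explicit

' Hypothetical helper, not from the original notes: checks that
' Not(Not p Or Not q) agrees with p And q in all four cases.
Sub checkAndFromNotOr()
    Dim p As Variant, q As Variant
    Dim allAgree As Boolean
    allAgree = True
    For Each p In Array(True, False)
        For Each q In Array(True, False)
            ' One row of the truth table per line in the Immediate window
            Debug.Print p, q, p And q, Not (Not p Or Not q)
            If (p And q) <> (Not (Not p Or Not q)) Then allAgree = False
        Next q
    Next p
    MsgBox "Definitions agree in every case: " & allAgree
End Sub
```

The four lines printed to the Immediate window amount to the truth table for And alongside its definition in terms of Not and Or; the two right-hand columns should match row by row.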


A.5 Control Structures 

There are several of these in Visual Basic, and we'll look at a few of them.
(And you should search the online help under "control structures.") We have
already seen one, the For...Next statement.

A.5.1 For...Next 


We've already seen it in action (above). The general structure for a
For...Next statement is:


For <counter> = <start> To <end> [Step <increment>]
    [statements]
Next [<counter>]


Note: Items in square brackets, [...], are optional. Items capitalized are
required parts of the statement. Items between left and right angle brackets,
<...>, are required to be filled in by the programmer. Thus, valid examples
for the For...Next statement include the following.


For I = 1 To 3 
MsgBox "Hello, world!" 
Next 


Better style is to do this:


For I = 1 To 3
    MsgBox "Hello, world!"
Next I


Or you can count down, if, e.g., MyIncrement is negative.


For MyCounter = MyStart To MyFinish Step MyIncrement
    MsgBox "MyCounter = " & MyCounter
Next MyCounter


Note: Be sure all these variables have reasonable values set for them before 
executing this statement. 


A.5.2 If...Then...


This is a very useful statement in programming languages. The basic
structure in VB is


If <condition> Then
    [statements]
End If


When an If...Then... statement is executed, the <condition> is tested
as a Boolean expression. If it evaluates to True, then the [statements] are
executed; otherwise they are skipped and processing continues with the next
statement, if any.

Note: The <condition> can also be an expression that returns a numeric
value. If when evaluated it returns 0, that is treated as False. Anything else
is treated as True.


Example:


If Age >= 65 Then
    NumberOfDeductions = NumberOfDeductions + 1
End If


Note: The <condition> expression may be complex. It may be an arbitrarily
complex Boolean combination of statements.

A.5.3 If...Then...Else
Probably used even more often than If...Then...


If <condition> Then
    [statements to execute if <condition> is true]
Else
    [statements to execute if <condition> is false]
End If

You use If...Then...Else when you want to do one thing if a condition
obtains, and another if it does not obtain. The =IF(...) function in Excel is
an If...Then...Else type of construct. Example: If the value in a certain
cell (or variable) is valid, then display an OK message; otherwise display a
not-OK message.
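
Here is a minimal sketch of that last example. The sub name checkCellValue, the choice of cell B1 on Sheet1, and the rule that "valid" means "holds a number" are all assumptions made for illustration.

```vba
' Hypothetical example: display an OK message if cell B1 of Sheet1
' holds a number, and a not-OK message otherwise.
Sub checkCellValue()
    Dim v As Variant
    v = Worksheets("Sheet1").Cells(1, 2).Value
    ' IsNumeric treats an empty cell as numeric, so rule that out too
    If IsNumeric(v) And Not IsEmpty(v) Then
        MsgBox "OK: the value " & v & " is valid."
    Else
        MsgBox "Not OK: the cell does not hold a number."
    End If
End Sub
```

Note that the condition is itself a Boolean combination (And, Not) of two simpler tests.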


A.5.4 Select Case 
More general than If...Then...Else is Select Case.


Select Case <test expression>
    Case <first expression list>
        [first statements]
    Case <second expression list>
        [second statements]...
    Case Else
        [else statements]
End Select


Here's an example from the SuperBook:


Select Case TotalPoints
    Case Is < 50
        FinalGrade = "F"
    Case Is < 60
        FinalGrade = "D"
    Case Is < 70
        FinalGrade = "C"
    Case Is < 80
        FinalGrade = "B"
    Case Else
        FinalGrade = "A"
End Select


This runs, but there's a lot that's wrong with it. The following is much
better. Why?


Sub testcase2()
    TotalPoints = 173
    Select Case TotalPoints
        Case 0 To 49
            FinalGrade = "F"
        Case 50 To 59
            FinalGrade = "D"
        Case 60 To 69
            FinalGrade = "C"
        Case 70 To 79
            FinalGrade = "B"
        Case 80 To 100
            FinalGrade = "A"
        Case Else
            FinalGrade = "Error in TotalPoints: " & TotalPoints
    End Select
    MsgBox "Final grade is: " & FinalGrade
End Sub


A.5.5 Do...Loop 


There are really two forms of Do...Loop: condition-at-the-top and
condition-at-the-bottom. Here they are:


Do {While | Until} <condition>
    [statements]
Loop


and


Do
    [statements]
Loop {While | Until} <condition>


where
{While | Until}
gets unpacked as either While or Until. While <condition> means so long
as the condition is true, and Until <condition> means until the condition is
true. The difference between the condition-at-the-top and the
condition-at-the-bottom versions lies mainly in that the condition-at-the-bottom
version is guaranteed to execute its [statements] at least once.
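
To see the "at least once" difference in action, here is a small sketch (the sub name doLoopForms is mine, chosen for illustration). Both loops start with a condition that is already false; only the bottom-tested one executes its body.

```vba
' Hypothetical example contrasting the two forms of Do...Loop.
Sub doLoopForms()
    Dim n As Integer

    ' Condition at the top: the body is skipped entirely,
    ' because the condition is already false on entry.
    n = 10
    Do While n < 10
        n = n + 1
    Loop
    MsgBox "Top-tested loop leaves n = " & n

    ' Condition at the bottom: the body runs once before the test.
    n = 10
    Do
        n = n + 1
    Loop While n < 10
    MsgBox "Bottom-tested loop leaves n = " & n
End Sub
```

The first message reports n = 10 and the second reports n = 11.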


A.5.6 Exiting a Loop 


Sometimes you need to break out of a loop. (Don't we all?) If you're in a
For...Next structure, break out with an Exit For statement. If you're in
a Do...Loop, break out with an Exit Do statement. Note: sometimes you
have to do this, but it's generally considered poor programming practice.
Why?
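
For concreteness, here is a sketch of Exit For in a typical setting: stopping a search as soon as the first match turns up. The sub name findFirstNegative and the sample data are illustrative assumptions.

```vba
' Hypothetical example: leave a For...Next loop early with Exit For.
Sub findFirstNegative()
    Dim values As Variant
    Dim I As Integer, foundAt As Integer
    values = Array(4, 7, -2, 9, -5)
    foundAt = LBound(values) - 1        ' one below the first index means "not found"
    For I = LBound(values) To UBound(values)
        If values(I) < 0 Then
            foundAt = I
            Exit For                    ' stop at the first match
        End If
    Next I
    MsgBox "First negative value is at index " & foundAt
End Sub
```

The same search could be written as a Do...Loop whose condition tests both the index and a found flag, which keeps the single exit point that the note above is hinting at.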


A.6 Arrays 


Arrays in VB should not be confused with arrays and array commands in
Excel, even though Excel's terminology invites this. All standard third
generation programming languages support arrays, and programs in these
languages typically rely a lot on arrays. Arrays are rather like vectors and
matrices in mathematics. A one-dimensional array is an ordered collection
of values, rather like a vector, which you can access (store or retrieve values)
by position. Here's a simple example.








' From "Code Module5" of vbtutor.xls
Sub arraytester1()
    Dim I, MyFirstArray(1 To 6) As Integer
    ' Load up the array
    For I = 1 To 6
        MyFirstArray(I) = I + 3
    Next I
    MsgBox "MyFirstArray(3) = " & MyFirstArray(3)
    ' Dump the array into a worksheet
    For I = 6 To 1 Step -1
        Worksheets("Sheet1").Cells(I, 6).Value = MyFirstArray(I)
    Next I
End Sub


Note: You declare an array in much the same way you declare any other
variable. (But see ReDim in the online help.) All of the elements in an array
must have the same data type. Of course, if the array is of type variant, this
is pretty loose. (But you can't have, e.g., arrays within arrays in VB.)
Here's a more interesting example, using a two-dimensional array.


Sub arraytester2()
    Dim I, J As Integer
    Dim MySecondArray(1 To 10, 1 To 20) As Single
    ' Load up the array and dump, forcing
    ' type conversion from Integer to Single
    For I = 1 To 10
        For J = 1 To 20
            MySecondArray(I, J) = CSng(I + J)
            Worksheets("Sheet2").Cells(I, J).Value = MySecondArray(I, J)
        Next J
    Next I
End Sub


We can go on to higher-dimensional arrays, but I think you get the idea.
In Excel VB programs, you typically only need one- and two-dimensional
(maybe three-dimensional) arrays.
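
Since ReDim was mentioned above, here is a minimal sketch of a dynamic array, one whose size is set at run time rather than in the Dim statement. The sub name dynamicArrayExample and the array Scores are illustrative assumptions.

```vba
' Hypothetical example: a dynamic array sized with ReDim.
Sub dynamicArrayExample()
    Dim n As Integer, I As Integer
    Dim Scores() As Double          ' no bounds yet: a dynamic array
    n = 5                           ' illustrative size, set at run time
    ReDim Scores(1 To n)
    For I = 1 To n
        Scores(I) = I * 1.5
    Next I
    ' ReDim Preserve changes the upper bound and keeps existing values
    ReDim Preserve Scores(1 To n + 2)
    MsgBox "Scores(5) = " & Scores(5) & ", upper bound = " & UBound(Scores)
End Sub
```

The message shows Scores(5) = 7.5 and an upper bound of 7. A plain ReDim without Preserve would have reset all the elements to 0.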
A.7 Dialog Boxes in Excel


A.7.1 Creating a New Dialog Box
Begin by creating a new Dialog sheet. See figure A.2.

Figure A.2: Creating a New Dialog Sheet

You then get a dialog box and the Forms menu/palette. See figure A.3.

Figure A.3: Dialog Box Form and Menu on the New Dialog Sheet

Choose List Box from the Forms palette and draw a list box on the dialog
box. See figure A.4. (Double click on the selected object, the list box, or
click Format | Object...)

Assign links to the list box. See figure A.5.

Test by choosing Run Dialog from the Forms palette. See figure A.6.

Figure A.6: Testing the List Box

Figure A.7: The List Box in Action


Click on the OK button. Notice what is written to the Cell Link cell: the
item number selected (not the value selected).

A.7.2 Calling a Dialog Box from a Sub

Now, write a subroutine (sub) that, when executed, runs and displays this
dialog box. Go to a code module sheet. Try this:

Sub displaymyfirstdialog()
    If DialogSheets("Dialog1").Show Then
        MsgBox "The user clicked the OK button."
    Else
        MsgBox "The user clicked the Cancel button."
    End If
End Sub
A.7.3 Adding a Button to a Dialog Box


Now let's see how to add a button to a dialog box. Click on the Create
Button tool from the Forms palette. Draw and edit a new button. Leave it
selected. See figure A.8.


Figure A.8: Adding a New Button

Choosing Tools | Assign Macro..., assign a macro to the selected button.
See figure A.9.

Figure A.9: Assigning a Macro to a New Button


A.7.4 Upward and Onward


This completes the very basics. You should play around and discover what
else can be done.


A.8 Miscellaneous Topics 


Now we'll discuss a list of useful things (methods and tricks) that
didn't fit easily in the previous discussion.





A.8.1 Constants 


Constants are like variables, except that they don't change. You use
constants in order to improve the readability of your program and to help reduce
errors. For example, if the maximum number of students in a classroom is
132, and you need this value a lot in your program, then you might want to
consider declaring a constant. You might do this:


Public Const maxstudents As Integer = 132


Then, throughout your program, you can just use maxstudents, without 
having to worry about typing 132 or making a mistake and typing some 
other number. (Recall: Option Explicit.) 


A.8.2 The Copy Method


Suppose you wish to copy one worksheet range to another worksheet range.
You can do this in Excel VBA with the Copy method. For example:


Sub copytest1()
    ' Suppose "carol" is the range B3:C4 on Sheet3 and
    ' "alice" is E4:F5 on Sheet3.
    ' The following works:
    Worksheets("Sheet3").Range("carol").Copy _
        destination:=Worksheets("Sheet3").Range("alice")
    ' And so does this:
    Worksheets("Sheet3").Range("carol").Copy _
        destination:=Worksheets("Sheet3").Cells(8, 8)
    ' and so does this:
    Worksheets("Sheet3").Range("b3:c4").Copy _
        destination:=Worksheets("Sheet3").Cells(10, 2)
End Sub





A.8.3 Referring to Single Column or Row Ranges


Suppose the name denise refers to a range consisting of a single column.
Then Range("denise").Cells(1).Value refers to the value in the topmost
cell in the range.


Sub democells1()
    x = Range("denise").Cells(1).Value
    y = Range("denise").Cells(2).Value
    MsgBox "x = " & x & " and y = " & y
End Sub


A.8.4 Sorting Worksheet Ranges 


See the Sort method. In Excel VBA you can direct the sorting of a worksheet
range. For example, the following subroutine sorts the range, DaRange, in
worksheet, DaWorkSheet, on the column, DaColumn, in descending order.

Sub sort1()
    Worksheets("DaWorkSheet").Range("DaRange").Sort _
        key1:=Range("DaColumn"), order1:=xlDescending
End Sub


A.8.5 Calling Subroutines from within Other Subroutines


A reasonable and normal thing to do. In fact it's recommended. Suppose
you had a main subroutine, called main, and you wanted it to call three other
subroutines, named mysub1, mysub2, and mysub3. Here's how:

Sub main()
    mysub1
    mysub2
    mysub3
End Sub


A.8.6 Calling Functions from Other Procedures 


Straightforward. See the bob function at the start of this appendix.
Then, here's an example.


Function bobagain(x)
    bobagain = bob(x) + bob(x)
End Function


And that's about it.
Appendix B


BasicGA: Code for Genetic
Algorithms [1]


B.1 Introduction


The purpose of this appendix is to lay out and discuss the code for BasicGA.
BasicGA is a program having some (very) basic genetic algorithm capabilities.
It was written in Visual Basic (Microsoft) and works in the Visual Basic
for Applications (VBA) dialect. BasicGA works in Excel. The purpose of
BasicGA is to serve as a shell or starting point for developing applications
using genetic algorithms, especially in a classroom environment. My intention,
thus, has been to make BasicGA as implementation-independent as possible.

Note: For program development, debugging, and other purposes, I have
often substituted a stub routine in BasicGA for what would be an actual
routine in a full application. [2] Such stubs will have "stub" appended to the
actual name, sometimes prefixed and sometimes postfixed. For example, the stub
for InitializeGA is InitializeGAStub but could be StubInitializeGA.
Also, in this documentation, names of code objects, e.g., procedures and
variables, will be given in a typewriter font, as we have just seen with
InitializeGA.

[1] File dt-basicga.latex. Created: November 27, 1995, from clami-code.latex.
Revised: 951222, 951128.

[2] A stub routine is not in final form, but is used during program
development.
B.2 Declarations


The purpose of this section is to describe the declared, global or public, 
variables for BasicGA. Here follows the declarations section of the BasicGA 
code. It is written in Microsoft Visual Basic 3.0 and has also been tested in 
the Microsoft Excel 5.0 environment. 


' sok 951127: This is in file: GACODE.BAS in
' the folder [illegible]

' Am freezing this for now. Call it the BasicGA program,
' version 951127.


Option Explicit


'*************** Below, constants declared that ***************
'*************** should be read in.             ***************


' sok 951126: Note: These program variables,
' in a non-stubbed
' environment, need to be declared in the declarations
' section. They are so declared, but I have commented
' out the declarations (see below).


'++++ from GetGARunPars +++++++++


Const NumberOfGenerations = 20
'GetNumberOfGenerations

Const PopulationSize = 100
'GetPopulationSize

Const CrossoverRate = .77
'GetCrossoverRate

Const MutationRate = .23
'GetMutationRate

Const BestNSaved = 100
'GetBestNSaved

'++++ from GetModelRunPars ++++

Const NumberOfDecisionVariables = 4

'GetOutputSize
Const OutputSize = 2


'++++ from/for InitDVarInfo/StubInitDVarInfo +++++


Dim DecisionVariableInfo(1 To NumberOfDecisionVariables,
==> 1 To 4) As Double


Const DecisionVariableInfo11 = 5 ' r, low
Const DecisionVariableInfo12 = 20 ' r, high
Const DecisionVariableInfo13 = 0 ' r, not integer
Const DecisionVariableInfo14 = 0 ' r, no grid search


Const DecisionVariableInfo21 = 10 ' v, low
Const DecisionVariableInfo22 = 30 ' v, high
Const DecisionVariableInfo23 = 0 ' v, not integer
Const DecisionVariableInfo24 = 0 ' v, no grid search


Const DecisionVariableInfo31 = 15 ' u, low
Const DecisionVariableInfo32 = 25 ' u, high
Const DecisionVariableInfo33 = 0 ' u, not integer
Const DecisionVariableInfo34 = 0 ' u, no grid search


Const DecisionVariableInfo41 = 200 ' l, low
Const DecisionVariableInfo42 = 300 ' l, high
Const DecisionVariableInfo43 = 0 ' l, not integer
Const DecisionVariableInfo44 = 0 ' l, no grid search


'*************** Above, constants declared that ***************
'*************** should be read in.             ***************


' Global variables


'++++ from GetGARunPars
' ==> but explicitly declared above +++++++++++


'Dim NumberOfGenerations As Integer
'Dim PopulationSize As Integer
'Dim CrossoverRate As Double
'Dim MutationRate As Double
'Dim BestNSaved As Integer


'++++ from GetModelRunPars
' ==> but explicitly declared above +++++++++++


'Dim NumberOfDecisionVariables As Integer
'Dim OutputSize As Integer


Dim Index As Integer

Global CurrentGeneration() As Double
Global AbsoluteFitness() As Double
Dim ChromosomeCopySpace() As Double
Dim RelativeFitness() As Double

Dim CrossoverLikelihood() As Double
Dim BestNCurrentSaveSet() As Double
Dim LowestAbsoluteFitness As Double
Dim HighestAbsoluteFitness As Double
Dim CurrentIDNum As Double

Dim NumberOfGenerationsSoFar As Integer
Dim CrossoverPoint As Integer


Dim NoisyOutput As Integer ' 1 = show lots of output;
' 0 = don't
The general structure and plan for the program is simple. Everything
revolves around two arrays.

First, the array CurrentGeneration holds the current generation of
chromosomes, one chromosome per row. CurrentGeneration has rows running
from 1 to PopulationSize, where PopulationSize is the number of
individuals or chromosomes maintained in each generation. CurrentGeneration
has columns running from 0 to NumberOfDecisionVariables, where
NumberOfDecisionVariables is the number of variables at play in the model
for the GA runs. Column 0 of CurrentGeneration holds the ID of the
corresponding chromosome.

Second, the array AbsoluteFitness holds the results of the fitness
evaluations for each chromosome in the current generation. AbsoluteFitness
has rows running from 1 to PopulationSize and a row of AbsoluteFitness
corresponds to a row of CurrentGeneration. AbsoluteFitness has columns
from 1 to OutputSize, where OutputSize is the number of distinct values
returned for a single chromosome by evaluation of the fitness function.
Usually, OutputSize will equal 1, that is, only 1 value is returned: the
absolute fitness of the chromosome at hand. Sometimes, however, it is useful
to have the fitness function return several values. If so, then their number
is indicated by OutputSize and it is the responsibility of the fitness
function, Sub Evaluate(I), to organize the response. By convention, column 1 of
AbsoluteFitness must hold the absolute (or raw) fitness of the chromosome
at hand.

BasicGA works by initializing CurrentGeneration, calculating fitnesses
with Evaluate(I) and thereby populating AbsoluteFitness. Then the next
generation is created. Crossing over is performed, mutation is performed,
and the cycle continues until the stopping condition (a count of the
generations in this code) is encountered. All this mostly happens through Sub
RunGAUntilDone.

Now some specific comments about these declarations. 














1. The following parameter is set in InitializeGA:

(a) CurrentIDNum. An integer, representing the ID number, or count,
of a given chromosome or solution.

2. The following parameters are set in GetGARunPars:
(a) NumberOfGenerations. Integer, should be > 0.
(b) PopulationSize. Integer, should be > 1.
(c) CrossoverRate. Floating point, should be in [0, 1].
(d) MutationRate. Floating point, should be in [0, 1].
(e) BestNSaved. Integer, should be > 0.

3. The following parameters are set in GetGAModelRunPars:
(a) NumberOfDecisionVariables. Integer, should be >= 1. This is the
number of input variables sent to the fitness evaluation function.
(b) OutputSize. Integer, should be >= 1. This is the number of output
values returned by the fitness evaluation function.

4. The following parameters are set in ReDimGAArrays:

(a) CurrentGeneration. Declared here as nonstatic, i.e.,
Dim CurrentGeneration() As Double.

(b) AbsoluteFitness. Declared here as nonstatic, i.e.,
Dim AbsoluteFitness() As Double.

(c) RelativeFitness. Declared here as nonstatic, i.e.,
Dim RelativeFitness() As Double.

(d) BestNCurrentSaveSet. Declared here as nonstatic, i.e.,
Dim BestNCurrentSaveSet() As Double.


B.3 Sub DoTheGA: Code Structure Overview 


Sub DoTheGA is the intended entry point to this program. Its structure is
quite simple and the source code is given in Figure B.1.


' *************** Main Program ***************
Sub DoTheGA ()
    Randomize (17)
    ChDir "c:\clasave\"
    NoisyOutput = 1
    ' 1. Make preparations to run the GA.
    PrepareGA
    ' 2. Run the GA until the stopping condition is met
    RunGAUntilDone
    ' 3. Postpare the system
    PostpareGA
End Sub

Figure B.1: Sub DoTheGA Source Code: Main Entry Point

A few comments are in order. The purpose of Randomize (17) is to
initialize the random number generator with a fixed seed. The intent is that
on each run the same sequence of random numbers will be generated, regardless
of which machine the program is run on. (In Visual Basic, a fully repeatable
sequence also requires calling Rnd with a negative argument immediately
before Randomize.)

ChDir "c:\clasave\"

is for the IBM PC (MS DOS) environment and will need to be changed or
commented out on the Macintosh. It assumes that a directory called clasave
exists on the C drive. The program writes its output files to this directory.

NoisyOutput is set to 1, turning on various comments during the running
of the program. Set it to 0 to turn these off.
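
On the matter of repeatable random numbers: the Visual Basic documentation for Randomize says that to repeat a sequence you should call Rnd with a negative argument immediately before calling Randomize with a numeric seed. Here is a minimal sketch of that idiom. The sub name repeatableRandomNumbers is mine and is not part of BasicGA.

```vba
' Hypothetical example, not part of BasicGA: reseed the generator
' twice in the same way and compare the first draw after each reseed.
Sub repeatableRandomNumbers()
    Dim firstRun As Single, secondRun As Single
    Call Rnd(-1)        ' negative argument: reset the generator
    Randomize 17
    firstRun = Rnd
    Call Rnd(-1)
    Randomize 17
    secondRun = Rnd
    MsgBox "Same first draw both times: " & (firstRun = secondRun)
End Sub
```

If repeatability matters for your experiments, the same two lines (Call Rnd(-1), then Randomize with your seed) could go at the top of Sub DoTheGA.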
Now, briefly, to the three subroutines called in Sub DoTheGA. 
B.3.1 PrepareGA


The purpose of this subroutine is to initialize the program and to generate
the first generation of chromosomes. The source code for this subroutine is
given in Figure B.2.


Sub PrepareGA ()
    ' 1. Initialize the system
    InitializeGA

    ' 2. Validate the input data
    ValidateGAInput

    ' 3. Generate the initial population of chromosomes
    MakeGAGenOne

    ' 4. Calculate the absolute and relative
    ' fitnesses for each chromosome.
    CalculateFitness

    ' 5. Initialize the save sets
    InitializeSaveSets
End Sub


Figure B.2: Sub PrepareGA Source Code
B.3.2 RunGAUntilDone


This is the subroutine that does the main work in the program. Its source
code is given in Figure B.3.


Sub RunGAUntilDone ()
    Do Until NumberOfGenerationsSoFar >= NumberOfGenerations
        If (NoisyOutput = 1) Then
            ' MainForm.ProgressBox.Text =
            ==> ' "NumberOfGenerationsSoFar = " & NumberOfGenerationsSoFar
        End If
        ' Now to the main business:
        PerformCrossover
        PerformMutation
        CalculateFitness
        UpdateTheSaveSets
        SortBestNCurrentSaveSet
        NumberOfGenerationsSoFar = NumberOfGenerationsSoFar + 1
    Loop
    If (NoisyOutput = 1) Then
        ' MainForm.ProgressBox.Text =
        ==> ' "NumberOfGenerationsSoFar = " & NumberOfGenerationsSoFar
    End If
End Sub


Figure B.3: Sub RunGAUntilDone Source Code. Note: Lines are broken
with my continuation symbol: ==>.
B.3.3 PostpareGA

Sub PostpareGA cleans things up once the GA has run its course. The
program does two things: writes out CurrentGeneration to a file and writes
out BestNCurrentSaveSet (the array holding the best N chromosomes found
to this point in the GA run) to a file. The source code is given in Figure B.4.


Sub PostpareGA ()
    ' Print out final generation.
    Print2FileCurGen

    ' Print out the best finds overall.
    Print2FileBestOverall

    If (NoisyOutput = 1) Then
        ' MainForm.ProgressBox.Text = "All done."
    End If
End Sub


Figure B.4: Sub PostpareGA Source Code
B.4 PrepareGA: Detailed Code Structure


B.4.1 InitializeGA


InitializeGA initializes CurrentIDNum to 0, then calls three subroutines.
The first, GetGARunPars, is for obtaining information needed to make this
run of the GA. The second, GetGAModelRunPars, is for obtaining particular
information about the model (fitness function) that is to be applied in this
particular run of the GA.

The third, ReDimGAArrays, only has the function of setting the sizes of
various dynamic arrays (see declarations section, above):


1. CurrentGeneration(1 To PopulationSize,
0 To NumberOfDecisionVariables) As Double.

2. AbsoluteFitness(1 To PopulationSize,
1 To OutputSize) As Double.

3. RelativeFitness(1 To PopulationSize) As Double.

4. BestNCurrentSaveSet(1 To BestNSaved + PopulationSize,
1 To NumberOfDecisionVariables + 1 + OutputSize) As Double.

GetGARunPars
The following program variables need to be initialized in this subroutine:
1. NumberOfGenerations.
2. PopulationSize.
3. CrossoverRate.
4. MutationRate.
5. BestNSaved.
GetGAModelRunPars
The following program variables need to be initialized in this subroutine:
1. NumberOfDecisionVariables.
2. OutputSize.

In addition the following array must be initialized:
1. DecisionVariableInfo.

Specifically,


ReDim DecisionVariableInfo(1 To _
NumberOfDecisionVariables, 1 To 4) As Double


should be declared and DecisionVariableInfo initialized.

In DecisionVariableInfo each row corresponds to a decision variable.
Column 1 holds the Low Value, column 2 the High Value for the row's variable.
Column 3 is 0 if the variable is not required to be an integer, and 1 otherwise.
Finally, column 4 holds grid search information. (BasicGA does not have any
grid search capability, but is designed to be expanded.) A 1 indicates that no
grid search is being done on that variable. A number larger than 1 indicates
that if a grid search is to be done, then the number represents the number
of grid points to be examined for that variable. The array holds floating
point numbers, and grid search counts are integers. It is up to the grid
search program to make the conversion. By convention, we truncate, e.g.,
3.1 stored goes to 3.






B.4.2 ValidateGAInput 


The purpose of this subroutine is to validate the information collected in the
InitializeGA subroutine. In the current version of the software, little or
nothing is done here. Beware!
B.4.3 MakeGAGenOne


Declare: ReDim CurrentGeneration(1 To PopulationSize, 0 To
NumberOfDecisionVariables) As Double. Each row holds a chromosome
of the current generation. Columns 1 through NumberOfDecisionVariables
hold values for the corresponding decision
variables. Column 0 holds the ID number of the solution.

This subroutine is very simple. It merely uses DecisionVariableInfo to
load up CurrentGeneration, with the aid of a random number generator.
Also, each member of the generation (i.e., each row) is given a unique ID.
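
To make the idea concrete, here is a minimal, self-contained sketch of loading up a first generation. It is not the BasicGA code (see the complete listing for that). The sub name sketchMakeGenOne, the local arrays Info and Gen, and the tiny sizes are all assumptions chosen so the example runs on its own; Info plays the role of DecisionVariableInfo and Gen the role of CurrentGeneration.

```vba
' Hypothetical sketch, not the BasicGA routine: fill a small
' generation with uniform random values between each variable's
' low and high, and give each row a unique ID in column 0.
Sub sketchMakeGenOne()
    Const PopSize As Integer = 5       ' illustrative sizes only
    Const NumVars As Integer = 2
    Dim Info(1 To NumVars, 1 To 4) As Double
    Dim Gen(1 To PopSize, 0 To NumVars) As Double
    Dim I As Integer, J As Integer, NextID As Double

    Info(1, 1) = 5: Info(1, 2) = 20    ' variable 1: low, high
    Info(2, 1) = 10: Info(2, 2) = 30   ' variable 2: low, high
    Info(2, 3) = 1                     ' variable 2 must be an integer

    Randomize
    NextID = 0
    For I = 1 To PopSize
        NextID = NextID + 1
        Gen(I, 0) = NextID             ' column 0 holds the ID
        For J = 1 To NumVars
            Gen(I, J) = Info(J, 1) + Rnd * (Info(J, 2) - Info(J, 1))
            ' Truncate when the variable is flagged as integer
            If Info(J, 3) = 1 Then Gen(I, J) = Int(Gen(I, J))
        Next J
    Next I
    MsgBox "Chromosome 1: ID " & Gen(1, 0) & ", values " & _
        Gen(1, 1) & ", " & Gen(1, 2)
End Sub
```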


B.4.4 CalculateFitness 


This routine calls Evaluate(I) for each member (row) of
CurrentGeneration. Evaluate(I) then calculates the fitness of that row
and stores it in AbsoluteFitness. By convention, the first column of
AbsoluteFitness is the absolute fitness of the corresponding row, or
solution. If the fitness function, Evaluate(I), returns more than one value,
additional values are stored in the second, third, and so on columns of
AbsoluteFitness.

Following this, CalculateRelativeFitness is called, which calculates the
relative fitnesses from the absolute fitnesses and stores them in
RelativeFitness, a one-dimensional array.
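
One common way to turn absolute fitnesses into relative ones is to divide each by the total, so that the relative fitnesses sum to 1. The sketch below assumes that rule, and assumes all absolute fitnesses are positive. It is an illustration only: the actual rule used by CalculateRelativeFitness is in the complete code listing. The names sketchRelativeFitness, AbsFit, and RelFit are mine.

```vba
' Hypothetical sketch: relative fitness as a share of total fitness.
' Assumes every absolute fitness is positive.
Sub sketchRelativeFitness()
    Dim AbsFit(1 To 4) As Double, RelFit(1 To 4) As Double
    Dim I As Integer, Total As Double
    AbsFit(1) = 10: AbsFit(2) = 30: AbsFit(3) = 40: AbsFit(4) = 20
    For I = 1 To 4
        Total = Total + AbsFit(I)
    Next I
    For I = 1 To 4
        RelFit(I) = AbsFit(I) / Total
    Next I
    MsgBox "Relative fitness of row 3 = " & RelFit(3)
End Sub
```

With these made-up numbers the total is 100, so row 3 gets a relative fitness of 0.4.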


B.4.5 InitializeSaveSets 


In the basic program, only one save set is used. BestNCurrentSaveSet stores
the best N solutions so far, plus the current generation. In this subroutine,
CurrentGeneration and AbsoluteFitness are read into
BestNCurrentSaveSet, which is then sorted on absolute fitness in the
subroutine SortBestNCurrentSaveSet.
B.5 Sub RunGAUntilDone: Detailed Code
Structure


As is clear from the code for Sub RunGAUntilDone (Figure B.3 and §B.7),
this procedure has five main subroutine calls. We now briefly describe each
and refer the reader to the complete code listing in §B.7.


B.5.1 PerformCrossover 


This is the most complex of the five subroutines, but the basic idea is simple.
Using fitness proportional selection, two chromosomes are randomly drawn
from CurrentGeneration. If crossover is drawn via a random number, then
the two chromosomes are crossed over and the results read into the holding
array, ChromosomeCopySpace. If crossover is not drawn, then the two
chromosomes are simply copied into ChromosomeCopySpace. This continues until
PopulationSize is reached, at which time ChromosomeCopySpace is copied
back into CurrentGeneration.
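The selection and crossover steps just described can be sketched as follows. This is a hedged illustration in Python, not the Visual Basic of the listing. All function names are assumptions made for the sketch, chromosomes are plain lists of decision-variable values, and the ID column that the listing keeps in column 0 is omitted.

```python
import random
from itertools import accumulate


def cumulative_likelihood(rel_fitness):
    """Running share of total relative fitness; the last entry is 1."""
    total = sum(rel_fitness)
    return [c / total for c in accumulate(rel_fitness)]


def pick(cumulative, rng):
    """Fitness proportional selection: first index whose cumulative share
    reaches a uniform random point in [0, 1)."""
    point = rng.random()
    for i, c in enumerate(cumulative):
        if c >= point:
            return i
    return len(cumulative) - 1  # guard against floating-point round-off


def crossover_pair(a, b, point):
    """Single-point crossover: swap the tails of a and b after `point` genes."""
    return a[:point] + b[point:], b[:point] + a[point:]


def next_generation(population, rel_fitness, crossover_rate, rng):
    """Draw pairs, cross them over with probability crossover_rate,
    otherwise copy them, until the population size is reached."""
    cumulative = cumulative_likelihood(rel_fitness)
    children = []
    while len(children) < len(population):
        a = population[pick(cumulative, rng)]
        b = population[pick(cumulative, rng)]
        if rng.random() <= crossover_rate:
            point = rng.randint(1, len(a) - 1)  # 1 .. number of variables - 1
            a, b = crossover_pair(a, b, point)
        children.extend([list(a), list(b)])
    return children[:len(population)]


print(crossover_pair([1, 2, 3, 4], [5, 6, 7, 8], 1))  # ([1, 6, 7, 8], [5, 2, 3, 4])
```

Restricting the crossover point to 1 through the number of variables minus 1 guarantees that each child takes at least one value from each parent.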








B.5.2 PerformMutation 


In this subroutine, the program loops through the entire array
CurrentGeneration. For each entry a random number is drawn to determine
whether there shall be a mutation. If there is to be a mutation, a uniform
random number is drawn between the declared high and low values for the
decision variable in question.
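A minimal sketch of this mutation step follows, in Python rather than the Visual Basic of the listing. The name mutate and the bounds argument (a list of (low, high) pairs standing in for the first two columns of DecisionVariableInfo) are assumptions made for the sketch.

```python
import random


def mutate(population, bounds, mutation_rate, rng):
    """Visit every entry; with probability mutation_rate replace it by a
    uniform draw between that variable's declared low and high values."""
    for row in population:
        for j, (low, high) in enumerate(bounds):
            if rng.random() < mutation_rate:
                row[j] = low + rng.random() * (high - low)
    return population


rng = random.Random(17)
population = [[10.0, 20.0], [11.0, 21.0]]
mutate(population, bounds=[(5, 20), (10, 30)], mutation_rate=0.23, rng=rng)
print(population)
```

Because the replacement value is drawn uniformly over the whole declared range, a mutation here is a fresh random value rather than a small perturbation of the old one.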





B.5.3 CalculateFitness 

This routine calls the sub Evaluate, which is a model-specific procedure that
calculates the values for a row of the array AbsoluteFitness.

B.5.4 UpdateTheSaveSets 


Only one save set is present in the program: BestNCurrentSaveSet. The
program reads CurrentGeneration into columns 0 through
NumberOfDecisionVariables and AbsoluteFitness into the remaining
higher-order columns, all this beginning at line BestNSaved + 1. This has
the effect of writing over the worst rows of BestNCurrentSaveSet, leaving
the best BestNSaved rows intact. The program then (next sub) sorts
BestNCurrentSaveSet on absolute fitness.
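The overwrite-then-sort idea can be sketched as follows, in Python rather than the Visual Basic of the listing. The function name update_save_set and the (fitness, label) row layout are assumptions made for the sketch, and Python's built-in sort stands in for the listing's own sort routine.

```python
def update_save_set(save_set, generation, best_n):
    """Keep the first best_n rows, replace everything after them with the
    new generation, then re-sort best-first on fitness (row[0])."""
    merged = save_set[:best_n] + list(generation)
    merged.sort(key=lambda row: row[0], reverse=True)
    return merged


save_set = [(9.0, "a"), (7.0, "b"), (1.0, "old"), (0.5, "old")]
new_generation = [(8.0, "c"), (2.0, "d")]
print(update_save_set(save_set, new_generation, best_n=2))
# [(9.0, 'a'), (8.0, 'c'), (7.0, 'b'), (2.0, 'd')]
```

After the sort, the first best_n rows again hold the best solutions seen so far, so a good solution is never lost even if it drops out of the current generation.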





B.5.5 SortBestNCurrentSaveSet 


The program uses a simple bubble sort on column
NumberOfDecisionVariables + 1 of BestNCurrentSaveSet. This column
is presumed to hold the absolute fitnesses of the various rows.
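For readers unfamiliar with the technique, a textbook bubble sort on one column, best (largest) first, can be sketched as follows. This is an illustration in Python and is not a transcription of the listing's routine, which compares rows in a slightly different order; the name bubble_sort_desc is an assumption made for the sketch.

```python
def bubble_sort_desc(rows, col):
    """Sort rows in place, largest rows[i][col] first, by repeatedly
    swapping adjacent rows that are out of order."""
    n = len(rows)
    swapped = True
    while swapped:
        swapped = False
        for i in range(n - 1):
            if rows[i][col] < rows[i + 1][col]:
                rows[i], rows[i + 1] = rows[i + 1], rows[i]
                swapped = True
        n -= 1  # the smallest remaining row has settled at the end
    return rows


rows = [[1, 0.2], [2, 0.9], [3, 0.5]]
print(bubble_sort_desc(rows, col=1))  # [[2, 0.9], [3, 0.5], [1, 0.2]]
```

Bubble sort takes on the order of n squared comparisons, which is acceptable for a save set of a few hundred rows but would be worth replacing for much larger populations.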





B.6 Sub PostpareGA: Detailed Code 
Structure 


Sub PostpareGA calls two subroutines in order to write to files the current
generation and the overall best N (= BestNSaved) chromosomes found during
the run of the GA. Source code for these two subroutines is given in Figures
B.5 and B.6.


B.7 Complete Code Listing 


There follows the complete listing of the code. Following the declarations 
section, the procedures, whether subs or functions, are in alphabetical order. 

Note: For purposes of fitting the listing on the typeset page, I have
occasionally broken lines. When I do this, I use the continuation symbol,
==>, which is not part of Visual Basic.


Sub Print2FileBestOverall ()
    Dim I, J As Integer
    Dim FNameBestOverall, FNumBestOverall
    Dim msg

    FNumBestOverall = FreeFile
    FNameBestOverall =
    ==> "B" & NumberOfGenerationsSoFar &
    ==> "F" & FNumBestOverall & ".TXT"
    Open FNameBestOverall For Output As FNumBestOverall
    For I = 1 To bestNSaved
        msg = ""
        For J = 0 To NumberOfDecisionVariables + OutputSize
            msg = msg & " " & BestNCurrentSaveSet(I, J)
        Next J
        Print #FNumBestOverall, msg
    Next I
    Close
End Sub


Figure B.5: Sub Print2FileBestOverall: Source Code. Note: My continuation
symbol, not in Visual Basic: ==>.


Sub Print2FileCurGen ()
    Dim I, J As Integer
    Dim FNameCG, FNumCG
    Dim msg

    FNumCG = FreeFile
    FNameCG = "C" & NumberOfGenerationsSoFar &
    ==> "G" & FNumCG & ".TXT"
    Open FNameCG For Output As FNumCG
    For I = 1 To PopulationSize
        msg = ""
        For J = 0 To NumberOfDecisionVariables
            msg = msg & " " & CurrentGeneration(I, J)
        Next J
        For J = 1 To OutputSize
            msg = msg & " " & AbsoluteFitness(I, J)
        Next J
        msg = msg & " " & RelativeFitness(I)
        Print #FNumCG, msg
    Next I
    Close
End Sub


Figure B.6: Sub Print2FileCurGen: Source Code.


' sok 951127: This is in file: GACODE.BAS in
' the folder clapopo3
' Am freezing this for now. Call it the BasicGA program,
' version 951127.


Option Explicit


'*****************************************************
'***** Below, constants declared that            *****
'***** should be read in.                        *****

' sok 951126: Note: These program variables,
' in a non-stubbed
' environment, need to be declared in the declarations
' section. They are so declared, but I have commented
' out the declarations (see below).

'+++++++++++ from GetGARunPars +++++++++++

Const NumberOfGenerations = 20
'GetNumberOfGenerations
Const PopulationSize = 100
'GetPopulationSize
Const CrossoverRate = .77
'GetCrossoverRate
Const MutationRate = .23
'GetMutationRate
Const bestNSaved = 100
'GetBestNSaved

'+++++++++++ from GetModelRunPars +++++++++++
Const NumberOfDecisionVariables = 4
'GetOutputSize
Const OutputSize = 2

'+++++ from/for InitDVarInfo/StubInitDVarInfo +++++

Dim DecisionVariableInfo(1 To NumberOfDecisionVariables,
==> 1 To 4) As Double

Const DecisionVariableInfo11 = 5 ' r, low
Const DecisionVariableInfo12 = 20 ' r, high
Const DecisionVariableInfo13 = 0 ' r, not integer
Const DecisionVariableInfo14 = 0 ' r, no grid search
Const DecisionVariableInfo21 = 10 ' v, low
Const DecisionVariableInfo22 = 30 ' v, high
Const DecisionVariableInfo23 = 0 ' v, not integer
Const DecisionVariableInfo24 = 0 ' v, no grid search
Const DecisionVariableInfo31 = 15 ' u, low
Const DecisionVariableInfo32 = 25 ' u, high
Const DecisionVariableInfo33 = 0 ' u, not integer
Const DecisionVariableInfo34 = 0 ' u, no grid search
Const DecisionVariableInfo41 = 200 ' l, low
Const DecisionVariableInfo42 = 300 ' l, high
Const DecisionVariableInfo43 = 0 ' l, not integer
Const DecisionVariableInfo44 = 0 ' l, no grid search

'+++++++++++++++++++++++++++++++++++++++++++++++++++++

'*****************************************************
'***** Above, constants declared that            *****
'***** should be read in.                        *****


' Global variables

'+++++ from GetGARunPars
'+++++ but explicitly declared above +++++
'Dim NumberOfGenerations As Integer
'Dim PopulationSize As Integer
'Dim CrossoverRate As Double
'Dim MutationRate As Double
'Dim bestNSaved As Integer

'+++++ from GetModelRunPars
'+++++ but explicitly declared above +++++
'Dim NumberOfDecisionVariables As Integer
'Dim OutputSize As Integer
'+++++++++++++++++++++++++++++++++++++++++

Dim Index As Integer
Global CurrentGeneration() As Double
Global AbsoluteFitness() As Double
Dim ChromosomeCopySpace() As Double
Dim RelativeFitness() As Double
Dim CrossoverLikelihood() As Double
Dim BestNCurrentSaveSet() As Double
Dim LowestAbsoluteFitness As Double
Dim HighestAbsoluteFitness As Double
Dim CurrentIDNum As Double

Dim NumberOfGenerationsSoFar As Integer
Dim CrossoverPoint As Integer

Dim NoisyOutput As Integer ' 1 = show lots of output;
                           ' 0 = don't

Sub CalculateFitness ()
    Dim I As Integer

    For I = 1 To PopulationSize
        Evaluate (I)
    Next I
    CalculateRelativeFitness
End Sub


Sub CalculateRelativeFitness ()
    Dim I As Integer
    Dim Interval, LowestAbsoluteFitness,
    ==> HighestAbsoluteFitness As Double

    LowestAbsoluteFitness = FindLowest()
    HighestAbsoluteFitness = FindHighest()
    Interval = HighestAbsoluteFitness - LowestAbsoluteFitness
    If HighestAbsoluteFitness < LowestAbsoluteFitness Then
        MsgBox "Whoa! in CalculateRelativeFitness,
        ==> HighestAbsoluteFitness = " & HighestAbsoluteFitness & " and
        ==> LowestAbsoluteFitness = " & LowestAbsoluteFitness
    End If
    For I = 1 To PopulationSize
        If Interval > .00000001 Then
            RelativeFitness(I) = (AbsoluteFitness(I, 1) -
            ==> LowestAbsoluteFitness) / Interval
        Else
            RelativeFitness(I) = 1
        End If
    Next I
End Sub


Sub CopyStrings (String1, String2, Index)
    Dim I As Integer

    For I = 0 To NumberOfDecisionVariables
        ChromosomeCopySpace(Index, I) =
        ==> CurrentGeneration(String1, I)
        ChromosomeCopySpace(Index + 1, I) =
        ==> CurrentGeneration(String2, I)
    Next I
End Sub


Function Crossover () As Integer
    Dim ReturnValue As Integer

    If Random01Value() <= CrossoverRate Then
        ReturnValue = 1
    Else
        ReturnValue = 0
    End If
    Crossover = ReturnValue
End Function


Sub CrossoverStrings (String1, String2, Index)
    Dim I As Integer

    CrossoverPoint = Int((Random01Value() *
    ==> (NumberOfDecisionVariables - 1)) + 1)
    If CrossoverPoint >= NumberOfDecisionVariables Then
        MsgBox "Whoa! In CrossoverStrings."
    End If

    ' Copy up to the crossover point
    For I = 1 To CrossoverPoint
        ChromosomeCopySpace(Index, I) =
        ==> CurrentGeneration(String1, I)
        ChromosomeCopySpace(Index + 1, I) =
        ==> CurrentGeneration(String2, I)
    Next I

    ' Copy past the crossover point to the end
    For I = CrossoverPoint + 1 To NumberOfDecisionVariables
        ChromosomeCopySpace(Index, I) =
        ==> CurrentGeneration(String2, I)
        ChromosomeCopySpace(Index + 1, I) =
        ==> CurrentGeneration(String1, I)
    Next I

    ' Assign new IDs to the chromosomes
    ChromosomeCopySpace(Index, 0) = GetCurrentIDNum()
    ChromosomeCopySpace(Index + 1, 0) = GetCurrentIDNum()
End Sub


'***************** Main Program *****************
Sub DoTheGA ()
    Randomize (17)
    ChDir "c:\clasave\"
    NoisyOutput = 1
    ' 1. Make preparations to run the GA.
    PrepareGA
    ' 2. Run the GA until the stopping condition is met
    RunGAUntilDone
    ' 3. Postpare the system
    PostpareGA
End Sub


Sub Evaluate (I)
    ' Note: This is a model-specific routine.
    ' And should be revised, e.g.
    ' p1 goes to r
    Dim p1, p2, p3, p4 As Double

    p1 = CurrentGeneration(I, 1)
    p2 = CurrentGeneration(I, 2)
    p3 = CurrentGeneration(I, 3)
    p4 = CurrentGeneration(I, 4)

    AbsoluteFitness(I, 1) = 2 * p1 * (1 + p2 / p3) / p4
    AbsoluteFitness(I, 2) = 2 * p1 * (1 + p2 / p3) / p4
End Sub


Function FindHighest () As Double
    Dim I As Integer
    Dim Highest As Double
    Highest = AbsoluteFitness(1, 1)
    For I = 1 To PopulationSize
        If AbsoluteFitness(I, 1) > Highest Then
        ==> Highest = AbsoluteFitness(I, 1)
    Next I
    FindHighest = Highest
End Function


Function FindLowest () As Double
    Dim I As Integer
    Dim Lowest As Double
    Lowest = AbsoluteFitness(1, 1)
    For I = 1 To PopulationSize
        If AbsoluteFitness(I, 1) < Lowest Then
        ==> Lowest = AbsoluteFitness(I, 1)
    Next I
    FindLowest = Lowest
End Function


Function GetCurrentIDNum () As Double
    CurrentIDNum = CurrentIDNum + 1
    GetCurrentIDNum = CurrentIDNum
End Function


Sub GetGARunPars ()
    ' This is a stub right now, with the program variables to be
    ' initialized here declared as constants in the
    ' declarations section.
    ' But here they are:
    'Const NumberOfGenerations = 2
    'GetNumberOfGenerations
    'Const PopulationSize = 50
    'GetPopulationSize
    'Const CrossoverRate = .77
    'GetCrossoverRate
    'Const MutationRate = .23
    'GetMutationRate
    'Const BestNSaved = 50
    'GetBestNSaved
    NumberOfGenerationsSoFar = 0
End Sub


Sub GetModelRunPars ()
    ' This is a stub right now, with the program variables to be
    ' initialized here declared as constants in the
    ' declarations section.
    ' But here they are:
    'NumberOfDecisionVariables = 4
    'GetNumberOfDecisionVariables
    'OutputSize = 2
    'GetOutputSize
    StubInitDVarInfo
    'or InitDVarInfo
End Sub


Sub InitializeGA ()
    CurrentIDNum = 0
    GetGARunPars
    GetModelRunPars
    ReDimGAArrays
End Sub


Sub InitializeSaveSets ()
    Dim I, J As Integer

    ' Number of rows is the number in the best N save set
    ' plus the population size
    ' Number of columns is no. decision variables + ID +
    ' absolute fitness

    ' Read in CurrentGeneration array
    For I = 1 To PopulationSize
        For J = 0 To NumberOfDecisionVariables
            BestNCurrentSaveSet(I, J) = CurrentGeneration(I, J)
        Next J
    Next I
    ' Read in AbsoluteFitness array
    ' Note: In the BestNCurrentSaveSet array
    ' the absolute fitness is
    ' kept in column number NumberOfDecisionVariables + 1.
    For I = 1 To PopulationSize
        For J = 1 To OutputSize
            BestNCurrentSaveSet(I, NumberOfDecisionVariables +
            ==> J) = AbsoluteFitness(I, J)
        Next J
    Next I
    SortBestNCurrentSaveSet
End Sub


Sub MakeGAGenOne ()
    Dim I, J As Integer
    Dim LowValue, HighValue As Double

    For I = 1 To PopulationSize
        For J = 1 To NumberOfDecisionVariables
            LowValue = DecisionVariableInfo(J, 1)
            HighValue = DecisionVariableInfo(J, 2)
            CurrentGeneration(I, J) =
            ==> RandomBetween(LowValue, HighValue)
        Next J
        CurrentGeneration(I, 0) = GetCurrentIDNum()
    Next I
End Sub


Sub PerformCrossover ()
    Dim I, J As Integer
    Dim String1, String2 As Integer
    Dim SumFitnesses As Double

    SumFitnesses = 0
    For I = 1 To PopulationSize
        SumFitnesses = SumFitnesses + RelativeFitness(I)
    Next I
    ' CrossoverLikelihood accumulates the probabilities of
    ' crossover. So,
    ' CrossoverLikelihood(PopulationSize) should = 1.
    CrossoverLikelihood(1) = RelativeFitness(1) / SumFitnesses
    For I = 2 To PopulationSize
        CrossoverLikelihood(I) =
        ==> (RelativeFitness(I) / SumFitnesses) +
        ==> CrossoverLikelihood(I - 1)
    Next I
    For I = 1 To PopulationSize Step 2
        String1 = RandomStrings()
        ' get a random string that can be crossed over
        String2 = RandomStrings()
        ' get a random string that can be crossed over
        If Crossover() = 1 Then
            ' If we do crossover here, then
            CrossoverStrings String1, String2, I
        Else ' We don't do crossover and
            ' we just copy the chromosomes to the next generation.
            Call CopyStrings(String1, String2, I)
        End If
    Next I

    ' copy back into the CurrentGeneration array
    For I = 1 To PopulationSize
        For J = 0 To NumberOfDecisionVariables
            CurrentGeneration(I, J) =
            ==> ChromosomeCopySpace(I, J)
        Next J
    Next I
End Sub


Sub PerformMutation ()
    Dim I, J As Integer

    For I = 1 To PopulationSize
        For J = 1 To NumberOfDecisionVariables
            If Random01Value() < MutationRate Then
                CurrentGeneration(I, J) =
                ==> RandomBetween(DecisionVariableInfo(J, 1),
                ==> DecisionVariableInfo(J, 2))
                CurrentGeneration(I, 0) = GetCurrentIDNum()
            End If
        Next J
    Next I
End Sub


Sub PostpareGA ()
    ' Print out final generation.
    Print2FileCurGen
    ' Print out the best finds overall.
    Print2FileBestOverall

    If (NoisyOutput = 1) Then
        MainForm.ProgressBox.Text = "All done."
    End If
End Sub

Sub PrepareGA ()
    ' 1. Initialize the system
    InitializeGA
    ' 2. Validate the input data
    ValidateGAInput
    ' 3. Generate the initial population of chromosomes
    MakeGAGenOne
    ' 4. Calculate the absolute and relative
    ' fitnesses for each chromosome.
    CalculateFitness
    ' 5. Initialize the save sets
    InitializeSaveSets
End Sub


Sub Print2FileBestOverall ()
    Dim I, J As Integer
    Dim FNameBestOverall, FNumBestOverall
    Dim msg

    FNumBestOverall = FreeFile
    FNameBestOverall = "B" & NumberOfGenerationsSoFar &
    ==> "F" & FNumBestOverall & ".TXT"
    Open FNameBestOverall For Output As FNumBestOverall
    For I = 1 To bestNSaved
        msg = ""
        For J = 0 To NumberOfDecisionVariables + OutputSize
            msg = msg & " " & BestNCurrentSaveSet(I, J)
        Next J
        Print #FNumBestOverall, msg
    Next I
    Close
End Sub


Sub Print2FileCurGen ()
    Dim I, J As Integer
    Dim FNameCG, FNumCG
    Dim msg

    FNumCG = FreeFile
    FNameCG = "C" & NumberOfGenerationsSoFar &
    ==> "G" & FNumCG & ".TXT"
    Open FNameCG For Output As FNumCG
    For I = 1 To PopulationSize
        msg = ""
        For J = 0 To NumberOfDecisionVariables
            msg = msg & " " & CurrentGeneration(I, J)
        Next J
        For J = 1 To OutputSize
            msg = msg & " " & AbsoluteFitness(I, J)
        Next J
        msg = msg & " " & RelativeFitness(I)
        Print #FNumCG, msg
    Next I
    Close
End Sub


Function Random01Value ()
    ' Note: Here and only here we use the 0-1
    ' random number generator built into Basic.
    Random01Value = Rnd
    ' return a random value from the interval [0,1)
End Function


Function RandomBetween (Low, High)
    RandomBetween = (Random01Value() * (High - Low)) + Low
End Function


Function RandomStrings ()
    ' The purpose of this routine is to pick
    ' a chromosome to contribute to the next
    ' generation. The likelihood of being picked
    ' is proportional to the relative fitness of the
    ' chromosome.
    Dim I As Integer
    Dim PointOnUnitInterval As Double

    PointOnUnitInterval = Random01Value()
    I = 1
    While CrossoverLikelihood(I) < PointOnUnitInterval
        I = I + 1
    Wend
    RandomStrings = I
End Function


Sub ReDimGAArrays ()
    ReDim CurrentGeneration(1 To PopulationSize,
    ==> 0 To NumberOfDecisionVariables) As Double
    ReDim ChromosomeCopySpace(1 To PopulationSize,
    ==> 0 To NumberOfDecisionVariables) As Double
    ReDim AbsoluteFitness(1 To PopulationSize,
    ==> 1 To OutputSize) As Double
    ReDim RelativeFitness(1 To PopulationSize) As Double
    ReDim BestNCurrentSaveSet(1 To bestNSaved +
    ==> PopulationSize, 0 To NumberOfDecisionVariables +
    ==> OutputSize) As Double
    ReDim CrossoverLikelihood(1 To PopulationSize)
    ==> As Double
End Sub


Sub RunGAUntilDone ()
    Do Until NumberOfGenerationsSoFar >= NumberOfGenerations
        If (NoisyOutput = 1) Then
            MainForm.ProgressBox.Text =
            ==> "NumberOfGenerationsSoFar = "
            ==> & NumberOfGenerationsSoFar
        End If
        ' Now to the main business:
        PerformCrossover
        PerformMutation
        CalculateFitness
        UpdateTheSaveSets
        SortBestNCurrentSaveSet
        NumberOfGenerationsSoFar = NumberOfGenerationsSoFar + 1
    Loop
    If (NoisyOutput = 1) Then
        MainForm.ProgressBox.Text =
        ==> "NumberOfGenerationsSoFar = "
        ==> & NumberOfGenerationsSoFar
    End If
End Sub


Sub SortBestNCurrentSaveSet ()
    Dim CurrentRow, I As Integer
    Dim ArraySize As Integer
    Dim SortIndex As Integer
    Dim NumberSwapped As Long

    NumberSwapped = -1
    ArraySize = bestNSaved + PopulationSize
    SortIndex = NumberOfDecisionVariables + 1
    ' Above: note that in InitializeSaveSets that
    ' the absolute fitness is read
    ' into column NumberOfDecisionVariables + 1
    While NumberSwapped <> 0
        NumberSwapped = 0
        For CurrentRow = 1 To ArraySize
            I = CurrentRow
            While I <= ArraySize
                If BestNCurrentSaveSet(CurrentRow, SortIndex) <
                ==> BestNCurrentSaveSet(I, SortIndex) Then
                    SwapRows CurrentRow, I
                    NumberSwapped = NumberSwapped + 1
                End If
                I = I + 1
            Wend
        Next CurrentRow
    Wend
End Sub


Sub StubInitDVarInfo ()
    Dim I, J As Integer

    ' Load up the array DecisionVariableInfo
    'For I = 1 To NumberOfDecisionVariables
    '    For J = 1 To 4
    '    Next J
    'Next I

    DecisionVariableInfo(1, 1) = DecisionVariableInfo11
    ' r, low
    DecisionVariableInfo(1, 2) = DecisionVariableInfo12
    ' r, high
    DecisionVariableInfo(1, 3) = DecisionVariableInfo13
    ' r, not integer
    DecisionVariableInfo(1, 4) = DecisionVariableInfo14
    ' r, no grid search

    DecisionVariableInfo(2, 1) = DecisionVariableInfo21
    ' v, low
    DecisionVariableInfo(2, 2) = DecisionVariableInfo22
    ' v, high
    DecisionVariableInfo(2, 3) = DecisionVariableInfo23
    ' v, not integer
    DecisionVariableInfo(2, 4) = DecisionVariableInfo24
    ' v, no grid search

    DecisionVariableInfo(3, 1) = DecisionVariableInfo31
    ' u, low
    DecisionVariableInfo(3, 2) = DecisionVariableInfo32
    ' u, high
    DecisionVariableInfo(3, 3) = DecisionVariableInfo33
    ' u, not integer
    DecisionVariableInfo(3, 4) = DecisionVariableInfo34
    ' u, no grid search

    DecisionVariableInfo(4, 1) = DecisionVariableInfo41
    ' l, low
    DecisionVariableInfo(4, 2) = DecisionVariableInfo42
    ' l, high
    DecisionVariableInfo(4, 3) = DecisionVariableInfo43
    ' l, not integer
    DecisionVariableInfo(4, 4) = DecisionVariableInfo44
    ' l, no grid search
End Sub


Sub SwapRows (Index1, Index2)
    Dim I As Integer
    Dim Temp() As Double
    ReDim Temp(0 To NumberOfDecisionVariables +
    ==> OutputSize) As Double

    For I = 0 To NumberOfDecisionVariables + OutputSize
        Temp(I) = BestNCurrentSaveSet(Index1, I)
    Next I

    For I = 0 To NumberOfDecisionVariables + OutputSize
        BestNCurrentSaveSet(Index1, I) =
        ==> BestNCurrentSaveSet(Index2, I)
    Next I

    For I = 0 To NumberOfDecisionVariables + OutputSize
        BestNCurrentSaveSet(Index2, I) = Temp(I)
    Next I
End Sub


Sub UpdateTheSaveSets ()
    Dim I, J As Integer
    ' Basically, this subroutine dumps the
    ' CurrentGeneration array
    ' and the AbsoluteFitness array into the
    ' BestNCurrentSaveSet array, by appending them after the
    ' current bestNSaved. Later, we sort the entire array.
    ' This is done as the next subroutine call in RunGAUntilDone.

    ' Right now only the best N overall save set
    ' Number of rows is the number in the save set plus
    ' the population size
    ' Number of columns is no. decision variables +
    ' ID + absolute fitness
    For I = 1 + bestNSaved To PopulationSize + bestNSaved
        For J = 0 To NumberOfDecisionVariables
            BestNCurrentSaveSet(I, J) =
            ==> CurrentGeneration(I - bestNSaved, J)
        Next J
        For J = 1 To OutputSize
            BestNCurrentSaveSet(I,
            ==> NumberOfDecisionVariables + J) =
            ==> AbsoluteFitness(I - bestNSaved, J)
        Next J
    Next I
End Sub


Sub ValidateGAInput ()
    ' Note: There's a lot more that needs doing here.

    If NumberOfGenerationsSoFar > NumberOfGenerations Then
        MsgBox "NumberOfGenerationsSoFar > NumberOfGenerations " &
        ==> NumberOfGenerationsSoFar & " " & NumberOfGenerations
    End If
End Sub


TWO
BIASES


The discussion of heuristics in Chapter 1 suggested that individuals develop
rules of thumb to reduce the information-processing demands of decision
making. These rules of thumb provide managers with efficient ways of dealing
with complex problems that produce good decisions a significant proportion of
the time. However, heuristics also lead managers to systematically biased
outcomes. A cognitive bias (or simply bias throughout this book) refers to
situations in which a heuristic is inappropriately applied by an individual in
reaching a decision.

This chapter is written to provide you with the opportunity to audit your own
decision making and identify the biases that affect you. A number of problems
are presented that allow you to examine your problem solving and learn how
your judgments compare to the judgments of others. The quiz items are then
used to illustrate 13 predictable biases to which managers are prone, and that
frequently lead to judgments that systematically deviate from rationality.
To start out, consider the following two problems:


Problem 1: The following 10 corporations were ranked by Fortune
magazine to be among the 500 largest United States-based firms
according to sales volume for 1987:
Group A: Gillette, Coca-Cola Enterprises, Lever Brothers, Apple
Computers, Hershey Foods
Group B: Coastal, Weyerhaeuser, Northrup, CPC International, Champion
International
Which group of five organizations listed (A or B) had the larger total sales
volume?


Problem 2: (Adapted from Kahneman and Tversky, 1973) 


The best student in my introductory MBA class this past semester writes
poetry and is rather shy and small in stature. What was the student's
undergraduate major:

(A) Chinese studies or

(B) Psychology?

What are your answers? If you answered A for each of the two problems, you
may gain comfort in knowing that the majority of respondents choose A. If you 
answered B, you are part of the minority. In this case, however, the minority 
represents the correct response. All corporations in group B were ranked in the 
Fortune 100, while none of the corporations in group A had sales as large. In 
fact, the total sales for group B was more than double the total sales for group 
A. In the second problem, the student was actually a psychology major, but 
more important, selecting psychology as the student's major represents a more 
rational response given the limited information. 

Problem 1 illustrates the availability heuristic discussed in Chapter 1. In this
problem, group A contains consumer firms, while group B consists of industrial
firms and holding companies. Most of us are more familiar with consumer firms
than conglomerates and can more easily generate information in our minds
about their size. If we were aware of our bias resulting from the availability
heuristic, we would recognize our differential exposure to this information and
adjust, or at least question, our judgments accordingly.

Problem 2 illustrates the representativeness heuristic. The reader who
responds "Chinese studies" has probably overlooked relevant base-rate
information, namely the likely ratio of Chinese studies majors to psychology
majors within the MBA student population. When asked to reconsider the problem
in this context, most people change their response to "psychology" in view of
the relative scarcity of Chinese studies majors seeking MBAs. This example
emphasizes that logical base-rate reasoning is often overwhelmed by
qualitative judgments drawn from available descriptive information.

The purpose of problems 1 and 2 is to demonstrate how easily faulty
conclusions are drawn when we overrely on cognitive heuristics. In the remainder
of this chapter, additional problems are presented to further increase your
awareness of the impact of heuristics on your decisions and to help you develop
an appreciation for the systematic errors that emanate from overdependence on
them. The thirteen biases examined in this chapter are relevant to virtually all
individuals. Each of the biases is related to at least one of the three judgmental
heuristics introduced in Chapter 1, and an effort has been made to categorize
them accordingly. However, it is important to remember that the way our minds
work in developing and using heuristics is not straightforward. Often our
heuristics work in tandem in approaching cognitive tasks.

The goal of the chapter is to help you "unfreeze" your decision-making 
patterns and realize how easily heuristics become biases when improperly 
applied. By working on numerous problems that demonstrate the failures of 
these heuristics, you will become more aware of the biases in your decision 
making. By learning to spot these biases, you can improve the quality of your 
decisions. 

Before reading further, please take a few minutes to respond to the prob- 
lems outlined in Table 2.1. They will be used to illustrate the 13 decision biases 
presented in the remainder of this chapter. 




Table 2.1 Chapter Problems 


Respond to the following 11 problems before reading the chapter. 


Problem 3: Which is riskier:
a. driving a car on a 400-mile trip?
b. flying on a 400-mile commercial airline flight?


Problem 4: Are there more words in the English language
a. that start with an r
b. for which r is the third letter?


Problem 5: Mark is finishing his MBA at a prestigious university. He is very
interested in the arts and at one time considered a career as a musician. Is
Mark more likely to take a job
a. in the management of the arts?
b. with a management consulting firm?


Problem 6: In 1986, two research groups sampled consumers on the driving
performance of the Dodge Colt versus the Plymouth Champ in a blind road test;
that is, the consumers did not know when they were driving the Colt or the
Champ. As you may know, these cars were identical; only the marketing varied.
One research group (A) sampled 66 consumers each day for 60 days (a large
number of days to control for weather and other variables), while the other
research group (B) sampled 22 consumers each day for 50 days. Which consumer
group observed more days in which 60 percent or more of the consumers tested
preferred the Dodge Colt?
a. Group A?
b. Group B?


Problem 7: You are about to hire a new central-region sales director for the
fifth time this calendar year. You predict that the next director should work
out reasonably well, since the last four were "lemons," and the odds favor
hiring at least one good sales director in five tries. This thinking is
a. Correct.
b. Incorrect.

Problem 8: You are the sales forecaster for a department store chain with
nine locations. The chain depends on you for quality projections of future
sales in order to make decisions on staffing, advertising, information system
developments, purchasing, renovation, and the like. All stores are similar in
size and merchandise selection. The main difference in their sales occurs
because of location and random fluctuations. Sales for 1989 were as follows:


Store           1989            1991

1        $12,000,000     ___________
2         11,500,000     ___________
3         11,000,000     ___________
4         10,500,000     ___________
5         10,000,000     ___________
6          9,500,000     ___________
7          9,000,000     ___________
8          8,500,000     ___________
9          8,000,000     ___________

TOTAL    $90,000,000     $99,000,000


Your economic forecasting service has convinced you that the best estimate of total
sales increases between 1989 and 1991 is 10 percent (to $99,000,000). Your task is to
predict 1991 sales for each store. Since your manager believes strongly in the
economic forecasting service, it is imperative that your total sales equal $99,000,000.


Problem 9: Linda is 31 years old, single, outspoken, and very bright. She majored in
philosophy. As a student, she was deeply concerned with issues of discrimination and
social justice, and she participated in antinuclear demonstrations.


Rank order the following eight descriptions in terms of the probability (likelihood) that
they describe Linda:

___ a. Linda is a teacher in an elementary school.
___ b. Linda works in a bookstore and takes yoga classes.
___ c. Linda is active in the feminist movement.
___ d. Linda is a psychiatric social worker.
___ e. Linda is a member of the League of Women Voters.
___ f. Linda is a bank teller.
___ g. Linda is an insurance salesperson.
___ h. Linda is a bank teller who is active in the feminist movement.


Problem 10: A newly hired engineer for a computer firm in the Boston metropolitan
area has four years of experience and good all-around qualifications. When asked to
estimate the starting salary for this employee, my secretary (knowing very little about
the profession or the industry) guessed an annual salary of $23,000. What is your
estimate?

$________ per year.

Problem 11: Which of the following appears most likely?
Which appears second most likely?

a. Drawing a red marble from a bag containing 50 percent red marbles
and 50 percent white marbles.

b. Drawing a red marble seven times in succession, with replacement
(a selected marble is put back in the bag before the next marble is
selected), from a bag containing 90 percent red marbles and 10
percent white marbles.

c. Drawing at least one red marble in seven tries, with replacement,
from a bag containing 10 percent red marbles and 90 percent white
marbles.


Problem 12: Listed below are 10 uncertain quantities. Do not look up any information
on these items. For each, write down your best estimate of the quantity. Next, put a lower
and upper bound around your estimate, such that you are 98 percent confident that your
range surrounds the actual quantity.

a. Mobil Oil's sales in 1987
b. IBM's assets in 1987
c. Chrysler's profit in 1987
d. The number of U.S. industrial firms in 1987 with sales greater than
those of Consolidated Papers
e. The U.S. gross national product in 1945
f. The amount of taxes collected by the U.S. Internal Revenue Service
in 1970
g. The length (in feet) of the Chesapeake Bay Bridge-Tunnel
h. The area (in square miles) of Brazil
i. The size of the black population of San Francisco in 1970
j. The dollar value of Canadian exports of lumber in 1977


Problem 13: (Adapted from Einhorn and Hogarth, 1978) 

it is claimed that when a particular analyst predicts a rise in the market, the market 
always rises. You are to check this claim. Examine the information available about the 
following four events (cards): 


Card 1          Card 2          Card 3          Card 4
Prediction:     Prediction:     Outcome:        Outcome:
Favorable       Unfavorable     Rise in the     Fall in the
report          report          market          market










You currently see the predictions (cards 1 and 2) or outcomes (cards 3 and 4) associ- 
ated with four events. You are seeing one side of a card. On the other side of cards 1 
and 2 is the actual outcome, while on the other side of cards 3 and 4 is the prediction 
that the analyst made. Evidence about the claim is potentially available by turning over
the card(s). Which cards would you turn over for the evidence that you need to check
the analyst's claim? (Circle the appropriate cards.)


Bias 1—Ease of Recall (based
upon vividness and recency)

Problem 3: Which is riskier: 


a. driving a car on a 400-mile trip? 
b. flying on a 400-mile commercial airline flight? 

Many people respond that flying in a commercial airliner is far riskier than
driving a car. The media's tendency to sensationalize airplane crashes contrib- 
utes to this perception. In actuality, the safety record for flying is far better than 
that for driving. Thus, this example demonstrates that a particularly vivid event 
will systematically influence the probability assigned to that type of event by an 
individual in the future. This bias occurs because vivid events are more easily 
remembered and consequently are more available when making judgments. 

Consider another example. A buyer of women's wear for a leading depart- 
ment store is assessing her purchasing needs in footwear. To fill the demand 
for casual shoes, she needs to choose between a proven best-selling brand of 
running shoes and a newer line of boating shoes. The buyer recalls having
seen a number of friends wearing boating shoes at a recent party and
concludes that demand for boating shoes is increasing. She decides to order more
boating shoes and reduce her order of the historically popular running shoes.

In making this choice, the buyer has biased her ordering decision based 
upon limited data and the ease with which it came to mind. The buyer judged 
the demand for boating shoes by the availability of her recollection of a recent 
party. Under the influence of this bias, she will be consistently less likely to buy 
popular shoes worn by other groups with whom she tends not to socialize— 
even though aggregate demand for these alternative styles may be higher. 

Tversky and Kahneman (1974) argue that when an individual judges the 
frequency of an event by the availability of its instances, an event whose 
instances are more easily recalled will appear more numerous than an event of 
equal frequency whose instances are less easily recalled. They cite evidence 
of this bias in a lab study in which individuals were read lists of names of
well-known personalities of both sexes and asked to determine whether the lists
contained the names of more men or women. Different lists were presented to 
two groups. One group received lists bearing the names of women who were 
relatively more famous than the listed men, but included more men's names 
overall. The other group received lists bearing the names of men who were 
relatively more famous than the listed women, but included more women's 
names overall. In each case, the subjects incorrectly guessed that the sex that 
had the more famous personalities was the more numerous. 

Many examples of this bias can be observed in the decisions made by 
managers in the workplace. The following came from the experience of one of 
my MBA students: As a purchasing agent, he had to select one of several 
possible suppliers. He chose the firm whose name was the most familiar to him. 
He later found out that the salience of the name resulted from recent adverse 
publicity concerning the firm's extortion of funds from client companies! 

Managers conducting performance appraisals often fall victim to the availability
heuristic. When a manager works from memory, vivid instances relating to an
employee (either pro or con) are more easily recalled, appear more numerous,
and are therefore weighted more heavily in the performance appraisal.
Managers also give more weight to performance during the three months prior
to the evaluation than to the previous nine months of the evaluation period.

Many consumers are annoyed by repeated exposure to the same advertising
message and often wonder why the advertiser doesn't give more useful
information, without repeating it so many times. After all, we are smart enough
to understand it the first time! Unfortunately, both the frequency and the
vividness of the message have been shown to affect our purchasing. This
bombardment of repeated, uninformative messages makes the product more easily
recalled from memory and is often the best way to get us to buy a product
(Alba and Marmorstein, 1987).

Because of our susceptibility to vividness and recency, Kahneman and
Tversky suggest that we are particularly prone to overestimating unlikely
events. For instance, if we actually witness a burning house, the impact on our
assessment of the probability of such accidents is probably greater than the
impact of reading about a fire in the local newspaper. The direct observation of
such an event makes it more salient to us. Similarly, Slovic and Fischhoff (1977)
discuss the implications of the misuse of the availability heuristic on the
perceived risks of nuclear power. They point out that any discussion of the
potential hazards, regardless of likelihood, will increase the memorability of
those hazards and increase their perceived risks.

The stock market provides some telling examples of the tendency to overreact
to vivid and recent information in this way. After the April 1986 nuclear accident at
Chernobyl in the Soviet Union, U.S. investors sold their nuclear stocks, which
caused a dramatic fall in prices. Yet the real safety of the nuclear systems did not
change dramatically as a result of the Chernobyl accident. Similarly, the stock of
Union Carbide fell 30 percent within three weeks of the December 1984 tragedy at
its chemical plant in Bhopal, India. Few investors stopped to realize that Union
Carbide might reach an acceptable out-of-court settlement. It was more salient to
imagine Union Carbide being hit with a devastating financial penalty. More
rational investors who bought the stock at its low point turned a hefty profit, even
before the stock moved up higher on an unsuccessful takeover bid (Curran,
1987).


Bias 2—Retrievability (based
upon memory structures)


Problem 4: Are there more words in the English language 


a. that start with an r? 
b. for which r is the third letter?


If you responded "start with an r," you have joined the majority. Unfortunately,
this is again the incorrect answer. Kahneman and Tversky (1973) explain that
people typically solve this problem by first recalling words that begin with r
and words that have an r as the third letter (like bar). The relative difficulty
of generating words in each of these two categories is then assessed. If we
think of our mind as being organized like a dictionary, it is easier to find lots of
words that start with an r. The dictionary, and our minds, are less efficient at
finding words that follow a rule that is inconsistent with the organizing
structure, like words that have an r as the third letter. Thus, words that start with a
particular letter are more available from memory, even though most consonants
are more common in the third position than in the first.

Just as our tendency to alphabetize affects our vocabulary-search behavior, 
organizational modes affect information-search behavior within our work lives. 
We structure organizations to provide order, but this same structure can lead to 
confusion if the presumed order is not exactly as suggested. For example, 
many organizations have a management information systems (MIS) division 
that has generalized expertise in computer applications. Assume that you are a 
manager in a product division and need computer expertise. If that expertise 
exists within MIS, the organizational hierarchy will lead you to the correct re- 
source. If they lack the expertise in a specific application, but it exists else- 
where in the organization, the hierarchy is likely to bias the effectiveness of your 
search. I am not arguing for the overthrow of organizational hierarchies; I am
merely identifying the dysfunctional role of hierarchies in potentially biasing
search behavior. If we are aware of the potential bias, we need not be affected 
by this limitation. 

Retail store location is influenced by the way in which consumers search 
their minds when seeking a particular commodity. Why are multiple gas sta- 
tions at the same intersection? Why do "upscale" retailers want to be in the 
same mall? Why are the best bookstores in a city often all located within a 
couple of blocks of each other? An important reason for this pattern is that
consumers learn the "location" for a particular type of product or store and
organize their minds accordingly. To maximize traffic, the retailer needs to be in
the location that consumers associate with this type of product or store.


Bias 3—Presumed Associations


People frequently fall victim to the availability bias in their assessment of the 
likelihood of two events occurring together. For example, consider the following 
questions: Is marijuana use related to delinquency? Are couples who get married
under the age of 25 more likely to have bigger families? How would you
respond if asked these questions? In assessing the marijuana question, most 
people typically remember several delinquent marijuana users and assume a 
correlation or not based upon the availability of this mental data. However, 
proper analysis would include recalling four groups of observations: marijuana 
users who are delinquents, marijuana users who are not delinquents, delin- 
quents who do not use marijuana, and nondelinquents who do not use mari- 
juana. The same analysis applies to the marriage question. Proper analysis 
would include four groups: couples who married young and have large fami- 
lies, couples who married young and have small families, couples who married 
older and have large families, and couples who married older and have small 
families. Indeed, there are always at least four separate situations to be consid- 
ered in assessing the association between two dichotomous events, but our 
everyday decision making commonly ignores this scientifically valid fact. 
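
The four-group comparison described above can be sketched in a few lines of Python. The counts are invented purely for illustration (they are not data from any study), and the helper name `delinquency_rate` is hypothetical. The point of the sketch is that the one memorable cell (users who are delinquents) says nothing on its own; an association shows up only when the rates in the two rows differ.

```python
# Hypothetical counts for the four groups; every number here is an assumption.
counts = {
    ("user", "delinquent"): 20,
    ("user", "not delinquent"): 80,
    ("non-user", "delinquent"): 50,
    ("non-user", "not delinquent"): 200,
}


def delinquency_rate(group):
    """Share of delinquents within one usage group."""
    delinquent = counts[(group, "delinquent")]
    total = delinquent + counts[(group, "not delinquent")]
    return delinquent / total


rate_users = delinquency_rate("user")          # 20 out of 100
rate_non_users = delinquency_rate("non-user")  # 50 out of 250

print(f"delinquency rate among users:     {rate_users:.2f}")
print(f"delinquency rate among non-users: {rate_non_users:.2f}")
```

With these assumed counts, someone who recalls twenty delinquent users might infer a strong link, yet the rate is the same in both rows, so the table shows no association at all.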

Chapman and Chapman (1967) have noted that when the probability of two
events co-occurring is judged by the availability of perceived co-occurring
instances in our minds, we usually assign an inappropriately high probability 
that the two events will co-occur again. Thus, if we know a lot of marijuana 
users who are delinquents, we assume that marijuana use is related to delin- 
quency. Similarly, if we know of a lot of couples who married young and have 
had large families, we assume that this trend is more prevalent than it may 
actually be. In testing for this bias, Chapman and Chapman provided subjects 
with information about hypothetical psychiatric patients. The information in- 
cluded a written clinical diagnosis of the "patient" and a drawing of a person 
made by the "patient." The subjects were asked to estimate the frequency with 
which each diagnosis (for example, suspiciousness or paranoia) was accompanied
by various facial and body features in the drawings (for example,
peculiar eyes). Throughout the study, subjects markedly overestimated the 
frequency of pairs commonly associated together by social lore. For example, 
diagnoses of suspiciousness were overwhelmingly associated with peculiar 
eyes. In addition, Chapman and Chapman found that conclusions such as the
one just noted were extremely resistant to change, even in the face of contradictory
information. Furthermore, the overwhelming impact of this bias toward presumed
associations prevented the subjects from detecting other relationships
that were, in fact, present. 

Summary. A lifetime of experience has led us to believe that, in general, more
frequent events are recalled in our minds more easily than less frequent ones, 
and likely events are easier to recall than unlikely events. In response to this 
learning, we have developed the availability heuristic for estimating the like- 
lihood of events. In many instances, this simplifying heuristic leads to accurate, 
efficient judgments. However, as these first three biases (ease of recall, re- 
trievability, and presumed associations) indicate, the misuse of the availability 
heuristic can lead to systematic errors in managerial judgment. We too easily 
assume that our available recollections are truly representative of some larger 
pool of occurrences that exists outside our range of experience. 


Bias 4—Insensitivity to Base Rates


Problem 5: Mark is finishing his MBA at a prestigious university. He is very
interested in the arts and at one time considered a career as a musician. Is
Mark more likely to take a job


a. in the management of the arts? 
b. with a management consulting firm? 


How did you decide on your answer? How do most people make this
assessment? How should people make this assessment? Using the representativeness
heuristic discussed in Chapter 1, most people approach this problem
by analyzing the degree to which Mark is representative of their image of 
individuals who take jobs in each of the two areas. Consequently, they usually 
conclude "in the management of the arts." However, as we discussed in the 
first part of this chapter, this response overlooks relevant base-rate information. 
Reconsider the problem in light of the fact that a much larger number of MBAs 
take jobs in management consulting than in the management of the arts— 
relevant information that should enter into any reasonable prediction of Mark's 
career path. With this base-rate data, it is only reasonable to predict "manage- 
ment consulting." 

Judgmental biases of this type frequently occur when individuals cognitively 
ask the wrong question. If you answered "in the management of the arts," you 
were probably thinking in terms of the question "How likely is it that a person 
working in the management of the arts would fit Mark's description?" However, 
the problem necessitates the question "How likely is it that someone fitting 
Mark's description will choose arts management?" By itself, the representa- 
tiveness heuristic incorrectly leads to a similar answer to both questions, since 
this heuristic leads individuals to compare the resemblance of the personal 
description and the career path. However, base-rate data, while irrelevant to the
first question listed, is crucial to a reasonable prediction on the second
question. While a large percentage of individuals in arts management may fit
Mark's description, there is undoubtedly a larger absolute number of
management consultants fitting Mark's description because of
the relative preponderance of MBAs in management consulting.
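
The difference between the two questions can be made concrete with a small calculation. Every figure below is a hypothetical assumption chosen only to illustrate the argument (the text gives no actual counts), and the variable names are illustrative. The sketch counts how many people in each career fit Mark's description and then asks what share of all such people are in arts management.

```python
# All figures are hypothetical; none comes from the text.
mbas_in_consulting = 2000     # MBAs taking management consulting jobs (assumed)
mbas_in_arts = 50             # MBAs taking arts-management jobs (assumed)
fit_given_consulting = 0.05   # share of consultants fitting Mark's description (assumed)
fit_given_arts = 0.60         # share of arts managers fitting it (assumed)

# Absolute number of people in each career who fit the description.
fit_consulting = mbas_in_consulting * fit_given_consulting
fit_arts = mbas_in_arts * fit_given_arts

# "How likely is it that someone fitting Mark's description chose arts management?"
p_arts_given_fit = fit_arts / (fit_arts + fit_consulting)

print(f"fit the description: {fit_consulting:.0f} consultants, {fit_arts:.0f} arts managers")
print(f"P(arts management | fits description) = {p_arts_given_fit:.2f}")
```

Under these assumptions the description is twelve times more typical of arts managers than of consultants, yet most people who fit it are still consultants, because the base rate dominates.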

An interesting finding of the research done by Kahneman and Tversky 
(1972, 1973) is that subjects do use base-rate data correctly when no other 
information is provided. For example, in the absence of a personal description 
of Mark in Problem 5, people will choose "management consulting" based on 
the past frequency of this career path for MBAs. Thus, people understand the 
relevance of base-rate information, but tend to disregard this data when de- 
scriptive data is also available. 


Bias 5—Insensitivity to Sample Size


Problem 6: In 1986, two research groups sampled consumers on the driv- 
ing performance of the Dodge Colt versus the Plymouth Champ in a blind 
road test; that is, the consumers did not know when they were driving the 
Colt or the Champ. As you may know, these cars were identical; only the 
marketing varied. 

One research group (A) sampled 66 consumers each day for 60 days (a 
large number of days to control for weather and other variables), while the 
other research group (B) sampled 22 consumers each day for 50 days. 
Which research group observed more days in which 60 percent or more of
the consumers tested preferred the Dodge Colt:

a. group A?
b. group B?


Most individuals expect research group A to provide more 60-percent days for
the Dodge Colt, because of the larger number of sample days; in other words,
there are 60 chances compared to 50. In contrast, simple statistics tells us that
it is much more likely to observe more 60-percent days on daily samples of 22
than on daily samples of 66, and the correct answer is group B. This is because
a large sample is far less likely to stray from the expected 50-percent
preference split between the Dodge Colt and Plymouth Champ, since the cars are
identical. (The interested reader can verify this fact with the use of an
introductory statistics book.)

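
For readers who would rather check the claim numerically, the following is a minimal sketch. It assumes each consumer independently prefers the Colt with probability 0.5 (the cars are identical) and uses the exact binomial distribution; the function name `prob_at_least_share` is illustrative, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def prob_at_least_share(n, share_percent=60, p=0.5):
    """P(at least share_percent of n consumers prefer the Colt), each with probability p."""
    # Smallest head count that reaches the share, using integer ceiling division.
    k_min = (share_percent * n + 99) // 100
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


p_a = prob_at_least_share(66)   # group A: 66 consumers per day
p_b = prob_at_least_share(22)   # group B: 22 consumers per day

expected_days_a = 60 * p_a      # group A sampled for 60 days
expected_days_b = 50 * p_b      # group B sampled for 50 days

print(f"group A: P = {p_a:.3f}, expected 60-percent days = {expected_days_a:.1f}")
print(f"group B: P = {p_b:.3f}, expected 60-percent days = {expected_days_b:.1f}")
```

Under these assumptions group B should expect roughly twice as many 60-percent days as group A, even though it sampled on fewer days.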
While the importance of sample size is fundamental in statistics, Kahneman
and Tversky (1974) note that it "is evidently not part of people's repertoire of
intuitions" (p. 1126). Why is this? When responding to problems dealing with
sampling, people often use the representativeness heuristic. In their minds,
they ask the question, Which group is likely to have more days in which the
results are skewed to 60 percent for the Dodge Colt instead of the expected 50
percent? From there, the representativeness heuristic leads them to focus on the
number of days as the pertinent variable for comparison. They then conclude
that the group covering the greater number of total days will experience the
greater number of total deviations. However, this line of reasoning ignores the
issue of sample size, which is critical to an accurate assessment of the problem.

Tversky and Kahneman (1974) first discovered this bias toward ignoring the 
role of sample size, even when these data were emphasized in the formulation of
the problem, in testing the following research problem:


A certain town is served by two hospitals. In the larger hospital about 45 babies are 
born each day, and in the smaller hospital about 15 babies are born each day. As you 
know, about 50 percent of all babies are boys. However, the exact percentage varies 
from day to day. Sometimes it may be higher than 50 percent, sometimes lower. 
For a period of one year, each hospital recorded the days on which more than 60
percent of the babies born were boys. Which hospital do you think recorded more
such days?


The larger hospital? (21) 

The smaller hospital? (21) 

About the same? (53) 

(that is, within 5 percent of each other) 


The values in parentheses represent the number of individuals who chose 
each answer. As explained earlier, sampling theory tells us that the expected 
number of days on which more than 60 percent of the babies are boys is much 
greater in the small hospital, since a large sample is less likely to stray from the 
mean. However, most subjects judged the probability to be the same in each 
hospital, effectively ignoring sample size. 
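
The sampling-theory claim can also be illustrated by simulation. The sketch below assumes each birth is independently a boy with probability 0.5 and that each hospital has exactly 45 or 15 births every day; the seed, the 20-year horizon (used only to smooth out noise), and the function name are arbitrary choices made for illustration.

```python
import random


def days_over_60_percent_boys(births_per_day, days, rng):
    """Count simulated days on which more than 60 percent of births are boys."""
    count = 0
    for _ in range(days):
        boys = sum(rng.random() < 0.5 for _ in range(births_per_day))
        if boys * 10 > births_per_day * 6:   # strictly more than 60 percent
            count += 1
    return count


rng = random.Random(0)   # fixed seed so the run is repeatable
years = 20               # assumption: average over 20 simulated years
days = 365 * years

large_per_year = days_over_60_percent_boys(45, days, rng) / years
small_per_year = days_over_60_percent_boys(15, days, rng) / years

print(f"larger hospital:  about {large_per_year:.0f} such days per year")
print(f"smaller hospital: about {small_per_year:.0f} such days per year")
```

Under these assumptions the smaller hospital should record roughly twice as many such days per year as the larger one, which is the pattern most subjects failed to anticipate.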

Consider the implications of this bias in advertising, where people trained in
market research understand the need for a sizable sample, but employ this
bias to the advantage of their clients. "Four out of five dentists surveyed
recommend sugarless gum for their patients who chew gum." There is no mention of
the number of dentists involved in the survey, even though without these
data the results of the survey are meaningless. If only 5 or 15 dentists were
surveyed, the results would not be generalizable to the overall
population of dentists.


Bias 6—Misconceptions of Chance


Problem 7: You are about to hire a new central-region sales director for the 
fifth time this year. You predict that the next director should work out reason- 
ably well, since the last four were "lemons," and the odds favor hiring at least 
one good sales director in five tries. This thinking is


a. correct. 
b. incorrect. 


Most people are comfortable with the foregoing logic, or at least have been 
guilty of using similar logic in the past. However, the performance of the first 
four sales directors will not directly affect the performance of the fifth sales 
director, and the logic in problem 7 is incorrect. Most individuals frequently rely 
upon their intuition and the representativeness heuristic and incorrectly
conclude that a poor performance is unlikely because the probability of getting five
"lemons" in a row is extremely low. Unfortunately, this logic ignores the fact that
we have already witnessed four "lemons" (an unlikely occurrence), and the
performance of the fifth sales director is independent of that of the first four.

This question parallels Kahneman and Tversky's (1972) work in which they 
show that people expect that a sequence of random events will "look" random.
They present evidence of this bias in their finding that subjects routinely judged 
the sequence of coin flips H-T-H-T-T-H to be more likely than H-H-H-T-T-T, 
which does not "appear" random, and more likely than the sequence H-H-H-H- 
T-H, which does not represent the equal likelihood of heads and tails. Simple 
statistics, of course, tell us that each of these sequences is equally likely 
because of the independence of multiple random events. 
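
The equal likelihood of the three sequences can be checked by brute force. This sketch assumes a fair coin and independent flips, enumerates all 64 possible six-flip sequences, and reports the share taken by each exact sequence. It then counts how many sequences contain three heads versus five heads, which is offered only as one possible reason a balanced-looking sequence feels more likely. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 2**6 = 64 equally likely outcomes of six fair, independent coin flips.
all_sequences = ["".join(flips) for flips in product("HT", repeat=6)]


def sequence_probability(seq):
    """Probability of one exact sequence: its share of the equally likely outcomes."""
    return Fraction(all_sequences.count(seq), len(all_sequences))


def sequences_with_heads(n_heads):
    """Number of six-flip sequences containing exactly n_heads heads."""
    return sum(1 for s in all_sequences if s.count("H") == n_heads)


for seq in ("HTHTTH", "HHHTTT", "HHHHTH"):
    print(seq, sequence_probability(seq))

print("sequences with 3 heads:", sequences_with_heads(3))
print("sequences with 5 heads:", sequences_with_heads(5))
```

Each exact sequence has probability 1/64. What differs is the size of the category a sequence appears to represent: there are 20 sequences with three heads but only 6 with five heads.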

Problem 7 moves beyond dealing with random events in recognizing our 
inappropriate tendency to assume that random and nonrandom events will 
"balance out." Will the fifth sales director work out well? Maybe. You might 
spend more time and money on selection, and the randomness of the hiring 
process may favor you this time. But your earlier failures in hiring sales direc- 
tors will not directly affect the performance of the new sales director.

The logic concerning misconceptions of chance provides a process explanation
of the gambler's fallacy. After holding bad cards on ten hands of poker,
the poker player believes that he is due for a good hand. After winning $1,000 
in the Pennsylvania State Lottery, a woman changes her regular number— 
because after all, how likely is it that the same number will come up twice? 
Tversky and Kahneman (1974) note that "Chance is commonly viewed as a 
self-correcting process in which a deviation in one direction induces a devia- 
tion in the opposite direction to restore the equilibrium. In fact, deviations are 
not corrected as a chance process unfolds, they are merely diluted."

In each of the preceding examples, individuals expected probabilities to 
even out. In some situations, our minds misconceptualize chance in exactly the 
opposite way. In sports (basketball specifically), we often think of a particular 
player as having a "hot hand" or "being on a good streak." If your favorite
player has hit his last four shots, is the probability of his making his next shot 
higher, lower, or the same as the probability of his making a shot without the 
preceding four hits? Most sports fans, sports commentators, and players be- 
lieve that the answer is "higher." In fact, there are many biological, emotional, 
and physical reasons that this answer could be correct. However, it is wrong! 
Gilovich, Vallone, and Tversky (1985) did an extensive analysis of the shooting 
of Philadelphia 76ers and Boston Celtics and found that immediately prior shot 
performance did not change the likelihood of success on the upcoming shot. 
Out of all of the findings in this book, this is the effect that my managerial 
students have had the hardest time believing. The reason is that we can all 
remember sequences of five hits in a row: streaks are part of our conception of 
chance in athletic competition. However, our minds do not categorize a string 
of “four in a row" as being a situation in which "he missed his fifth shot." As a 
result, we have a misconception of connectedness, when, in fact, chance (or 
the player's normal probability of success) is really in effect.

The belief in the hot hand is especially interesting because of its implication 
for how players play the game. Passing the ball to the player who is "hot" is 
commonly endorsed as a good strategy. It can also be expected that the 
opposing team will concentrate on guarding the hot player. Another player, 
who is less "hot" but is equally skilled, may have a better chance of scoring.
Thus the belief in the "hot hand" is not just erroneous, but could also be costly if
you play professional basketball.

Tversky and Kahneman's (1971) work shows that misconceptions of chance
are not limited to gamblers, sports fans, or laypersons. Research psychologists
also fall victim to the "law of small numbers." They believe that sample events
should be far more representative of the population from which they were
drawn than simple statistics would dictate. The researchers put too much faith
in the results of initial samples and grossly overestimate the replicability of
empirical findings. This suggests that the representativeness heuristic may be
so well institutionalized in our decision processes that even scientific training
and its emphasis on the proper use of statistics may not effectively eliminate its
biasing influence.


Bias 7—Regression to the Mean


Problem 8: You are the sales forecaster for a department store chain with 
nine locations. The chain depends on you for quality projections of future 
sales in order to make decisions on staffing, advertising, information system 
developments, purchasing, renovation, and the like. All stores are similar in
size and merchandise selection. The main difference in their sales occurs 
because of location and random fluctuations. Sales for 1989 were as 
follows: 


Store        1989            1991
1         $12,000,000     $________
2          11,500,000      ________
3          11,000,000      ________
4          10,500,000      ________
5          10,000,000      ________
6           9,500,000      ________
7           9,000,000      ________
8           8,500,000      ________
9           8,000,000      ________
TOTAL     $90,000,000     $99,000,000


Your economic forecasting service has convinced you that the best estimate
of total sales increases between 1989 and 1991 is 10 percent (to
$99,000,000). Your task is to predict 1991 sales for each store. Since your
manager believes strongly in the economic forecasting service, it is
imperative that your total sales are equal to $99,000,000.


Think about the processes used to answer this problem. Consider the following 
logical pattern of thought: "The overall increase in sales is predicted to be 10 
percent [($99,000,000 - $90,000,000)/$90,000,000]. Lacking any other specific
information on the stores, it makes sense to simply add 10 percent to each
1989 sales figure to predict 1991 sales. This means that I predict sales of
$13,200,000 for store 1, sales of $12,650,000 for store 2, and so on." This logic,
in fact, is the most common approach in responding to this item. Unfortunately,
this logic is faulty.

Why was the logic presented faulty? Statistical analysis would dictate that 
we first assess the predicted relationship between 1989 and 1991 sales. This 
relationship, formally known as a correlation, can vary from total indepen- 
dence (that is, 1989 sales do not predict 1991 sales) to perfect correlation 
(1989 sales are a perfect predictor of 1991 sales). In the former case, the lack 
of a relationship between 1989 and 1991 sales would mean that 1989 sales 
would provide absolutely no information about 1991 sales, and your best esti- 
mates of 1991 sales would be equal to total sales divided by the number of 
stores ($99,000,000 divided by 9 equals $11,000,000). However, in the latter
case of perfect predictability between 1989 and 1991 sales, our initial logic of
simply extrapolating from 1989 performance by adding 10 percent to each
store's performance would be completely accurate. Obviously, 1989 sales are 
most likely to be partially predictive of 1991 sales—falling somewhere between 
independence and perfect correlation. Thus, the best prediction for store 1 
should lie between $11,000,000 and $13,200,000, depending upon how pre- 
dictive you think 1989 sales will be of 1991 sales. The key point is that in 
virtually all such predictions, you should expect the naive $13,200,000 esti- 
mate to regress toward the overall mean ($11,000,000). 
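
One simple way to turn this reasoning into numbers is to shrink each naive forecast toward the overall mean by an amount that depends on the assumed correlation. The sketch below is an illustration under stated assumptions, not a method given in the text: the correlation `r` is a value the forecaster must supply, the linear shrinkage rule assumes store-to-store variability is similar in both years, and the function name `forecast_1991` is hypothetical.

```python
sales_1989 = [12_000_000, 11_500_000, 11_000_000, 10_500_000, 10_000_000,
              9_500_000, 9_000_000, 8_500_000, 8_000_000]
TOTAL_1991 = 99_000_000


def forecast_1991(sales, total, r):
    """Shrink each naive (proportional) forecast toward the overall mean.

    r = 0 gives every store the mean; r = 1 reproduces the naive forecast.
    """
    growth = total / sum(sales)   # 1.10 for the figures above
    mean = total / len(sales)     # $11,000,000 per store
    return [mean + r * (growth * s - mean) for s in sales]


for r in (0.0, 0.5, 1.0):         # 0.5 is an arbitrary illustrative value
    forecasts = forecast_1991(sales_1989, TOTAL_1991, r)
    print(f"r = {r:.1f}: store 1 = ${forecasts[0]:,.0f}, "
          f"store 9 = ${forecasts[-1]:,.0f}, total = ${sum(forecasts):,.0f}")
```

For any value of `r` the forecasts still add up to $99,000,000, so the manager's constraint is respected; only the spread between the strongest and weakest stores changes.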

In a study of sales forecasting, Cox and Summers (1987) examined the 
judgments of professional retail buyers. They examined the sales data from 2 
department stores for 6 different apparel styles for a total of 12 different sales 
forecasts over a 2-week period. They found that sales between the 2 weeks 
regressed to the mean. However, the judgment of all 31 buyers from 5 different 
department stores failed to reflect the tendency for regression to the mean. As
a result, Cox and Summers argued that a sales-forecasting model that consid- 
ered regression to the mean could outperform the judgments of all 31 profes- 
sional buyers. 

Many effects regress to the mean. Brilliant students frequently have less 
successful siblings. Short parents tend to have taller children. Great rookies 
have mediocre second years (the "sophomore jinx"). Firms that have outstand- 
ing profits one year tend to have lesser performances the next year. In each 
case, individuals are often surprised when made aware of these predictable 
patterns of regression to the mean.

Why is the regression-to-the-mean concept, while statistically valid, coun- 
terintuitive? Kahneman and Tversky (1973) suggest that the representativeness 
heuristic accounts for this systematic bias in judgment. They argue that indi- 
viduals typically assume that future outcomes (for example, 1991 sales) will be 
maximally representative of past outcomes (1989 sales). Thus, we tend to 
naively develop predictions that are based upon the assumption of perfect 
correlation with past data.

In some unusual situations, individuals do intuitively expect a regression-to- 
the-mean effect. In 1980, when George Brett batted .384, most people did not 
expect him to hit .384 the following year. When Wilt Chamberlain scored 100 
points in a single game, most people did not expect him to score 100 points in 
his next game. When a historically 3.0 student got a 4.0 one semester, her 
friends did not expect a repeat performance the following semester. When a 
real estate agent sold five houses in one month (an abnormally high perfor- 
mance), his co-agents did not expect similar performance in the following 
month. Why is regression to the mean more intuitive in these cases? Because 
the performance is so extreme that we know it cannot last. Thus, under very 
unusual circumstances, we expect performance to regress. However, we gen- 
erally do not recognize the regression effect in less extreme cases.

Consider Kahneman and Tversky's (1973) classic example in which the 
misconceptions surrounding regression led to overestimation of the effective- 
ness of punishment and the underestimation of the power of reward. Here, in a 
discussion about flight training, experienced instructors noted that praise for
an exceptionally smooth landing was typically followed by a poorer landing on 
the next try, while harsh criticism after a rough landing was usually followed by 
an improvement on the next try. The instructors concluded that verbal rewards 
were detrimental to learning, while verbal punishments were beneficial. Ob- 
viously, the tendency of performance to regress to the mean can account for 
the results; verbal feedback may have had absolutely no effect. However, to 
the extent that the instructors were prone to biased decision making, they were 
prone to reach the false conclusion that punishment is more effective than 
positive reinforcement in shaping behavior. 
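
A small simulation can make the instructors' mistake concrete. The sketch below assumes, purely for illustration, that each landing score is a pilot's fixed skill plus independent random noise, and that feedback has no effect whatsoever. The function name, the score scale, and the 10 percent cutoffs are hypothetical choices, not details of the original discussion.

```python
import random
import statistics

def simulate_landings(n_pilots=20000, seed=0):
    """Simulate two consecutive landings per pilot with NO feedback effect.

    Assumption (illustrative only): landing score = fixed skill + random noise.
    Returns the average change in score following the best and the worst
    10 percent of first landings.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pilots):
        skill = rng.gauss(0.0, 1.0)            # stable ability
        first = skill + rng.gauss(0.0, 1.0)    # first landing
        second = skill + rng.gauss(0.0, 1.0)   # next landing, same skill
        pairs.append((first, second))

    pairs.sort(key=lambda p: p[0])             # order by first-landing score
    cut = n_pilots // 10
    worst, best = pairs[:cut], pairs[-cut:]

    change_after_best = statistics.mean(s - f for f, s in best)
    change_after_worst = statistics.mean(s - f for f, s in worst)
    return change_after_best, change_after_worst

after_praise, after_criticism = simulate_landings()
print(f"Average change after the smoothest landings: {after_praise:+.2f}")
print(f"Average change after the roughest landings:  {after_criticism:+.2f}")
```

Under these assumptions the smoothest landings are followed by a decline and the roughest by an improvement, even though no praise or criticism was modeled at all.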

How do managers respond when they do not acknowledge the regression 
principle? Consider an employee with very high performance in one perfor- 
mance period. He (and his boss) may inappropriately expect similar perfor- 
mance in the next period. What happens when his performance regresses 
toward the mean? He (and his boss) begin to make excuses for not meeting 
expectations. Obviously, they are likely to develop false explanations and may 
inappropriately plan their future efforts. 


Bias 8—The Conjunction Fallacy


Problem 9: Linda is 31 years old, single, outspoken, and very bright. She
majored in philosophy. As a student, she was deeply concerned with issues
of discrimination and social justice, and she participated in antinuclear
demonstrations.

Rank order the following eight descriptions in terms of the probability
(likelihood) that they describe Linda:

___ a. Linda is a teacher in an elementary school.
___ b. Linda works in a bookstore and takes yoga classes.
___ c. Linda is active in the feminist movement.
___ d. Linda is a psychiatric social worker.
___ e. Linda is a member of the League of Women Voters.
___ f. Linda is a bank teller.
___ g. Linda is an insurance salesperson.
___ h. Linda is a bank teller who is active in the feminist movement.


Examine your rank orderings of descriptions C, F, and H. Most people rank 
order C as more likely than H and H as more likely than F. The reason for this 
ordering is that C-H-F is the order of the degree to which the descriptions are 
representative of the short profile of Linda. The description of Linda was con-
structed by Tversky and Kahneman to be representative of an active feminist
and unrepresentative of a bank teller. Recall from the representativeness 
heuristic that people make judgments according to the degree to which a 
specific description corresponds to a broader category within their minds.
Linda's description is more representative of a feminist than of a feminist bank
teller, and is more representative of a feminist bank teller than of a bank teller. 
Thus, the representativeness heuristic accurately predicts that most indivi- 
duals will rank order the items C-H-F. 

Although the representativeness heuristic accurately predicts how indi- 
viduals will respond, it also leads to another common, systematic distortion of 
human judgment—the conjunction fallacy (Tversky and Kahneman, 1983). 
This is illustrated by a reexamination of the potential descriptions of Linda. One 
of the simplest and most fundamental qualitative laws of probability is that a 
subset (for example, being a bank teller and a feminist) cannot be more likely 
than a larger set that completely includes the subset (e.g., being a bank teller). 
Statistically speaking, the broad set "Linda is a bank teller" must be rated at
least as likely as, if not more likely than, the description "Linda is a bank teller and a
feminist." After all, there is some chance (although it is small) that Linda is a 
bank teller but not a feminist. Based upon this logic, a rational assessment of 
the likelihoods of Linda being depicted by the eight descriptions must include 
a more likely rank for F than H. 
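
The conjunction rule can be checked with simple arithmetic. The sketch below uses made-up probabilities for Linda (illustrative assumptions, not estimates from Tversky and Kahneman) to show that however likely feminism is among bank tellers, the conjunction H can never be more probable than F alone.

```python
def conjunction_probability(p_teller, p_feminist_given_teller):
    """P(teller and feminist) = P(teller) * P(feminist | teller)."""
    return p_teller * p_feminist_given_teller

# Hypothetical numbers, chosen only for illustration.
p_teller = 0.05
for p_feminist_given_teller in (0.10, 0.50, 0.95, 1.00):
    p_both = conjunction_probability(p_teller, p_feminist_given_teller)
    # A conditional probability is at most 1, so the product is at most p_teller.
    assert p_both <= p_teller
    print(f"P(F) = {p_teller:.3f}   P(H) = {p_both:.4f}")
```

The two probabilities are equal only in the limiting case where every bank teller is assumed to be a feminist.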

While simple statistics can demonstrate that a conjunction (a combination of 
two or more descriptors) cannot be more probable than any one of its descrip- 
tors, the conjunction fallacy predicts and demonstrates that a conjunction will 
be judged more probable than a single component descriptor when the con- 
junction appears more representative than the component descriptor. Intu- 
itively, thinking of Linda as a feminist bank teller "feels" more correct than 
thinking of her as only a bank teller. 

The conjunction fallacy can also operate based on greater availability of the 
conjunction than one of the unique descriptors (Yates and Carlson, 1986). That 
is, if the conjunction creates more intuitive matches with vivid events, acts, or 
people than a component of the conjunction, the conjunction is likely to be 
perceived falsely as more probable than the component. For example, Tversky
and Kahneman (1983) found that experts (in July 1982) evaluated the probability
of


"a complete suspension of diplomatic relations between the USA and the Soviet 
Union, sometime in 1983" 


as less likely than the probability of 


"a Russian invasion of Poland, and a complete suspension of diplomatic relations 
between the USA and the Soviet Union, some time in 1983." 


As earlier demonstrated, suspension is necessarily more likely than invasion 
and suspension. However, a Russian invasion followed by a diplomatic crisis 
provides a more intuitively viable story than simply a diplomatic crisis. Similarly, 
in the domain of natural disasters, Kahneman and Tversky's subjects rated


"a massive flood somewhere in North America in 1989, in which 1,000 people drown"


as less likely than the probability of


"an earthquake in California sometime in 1989, causing a flood in which more than
1,000 people drown."


It is obvious that the latter possibility is a subset of the former, and many other 
events could cause the flood in North America. 

Tversky and Kahneman (1983) have shown that the conjunction fallacy is 
likely to lead to deviations from rationality in the judgments of sporting events, 
criminal behavior, international relations, and medical judgments. Our obvious 
concern with biased decision making resulting from the conjunction fallacy is
that if we make systematic deviations from rationality in the prediction of future
outcomes, we will be less prepared for dealing with future events.


Summary This discussion concludes our examination of the five biases (in- 
sensitivity to base rates, insensitivity to sample size, misconceptions of chance, 
regression to the mean, and the conjunction fallacy) that emanate from the use 
of the representativeness heuristic. Experience has taught us that the likelihood 
of a specific occurrence is related to the likelihood of a group of occurrences 
that that specific occurrence represents. Unfortunately, we tend to overuse this 
information in making decisions. The five biases we have just explored illustrate 
the systematic irrationalities that can occur in our judgments when we are not 
aware of this overreliance.


BIASES EMANATING FROM ANCHORING 
AND ADJUSTMENT 


Bias 9—Insufficient Anchor Adjustment


Problem 10: A newly hired engineer for a computer firm in the Boston 
metropolitan area has four years of experience and good all-around qualifi- 
cations. When asked to estimate the starting salary for this employee, my 
secretary (knowing very little about the profession or the industry) guessed 
an annual salary of $23,000. What is your estimate? 

$ ______ per year.


Was your answer affected by my secretary's response? Most people do not 
think that my secretary's response affected their response. However, indi- 
viduals are affected by the fairly irrelevant information contained in my secre- 
tary's estimate. Reconsider how you would have responded if my secretary had 
estimated $80,000. On average, individuals give higher salary estimates to the 
problem when the secretary's estimate is stated as $80,000 than when it is 
stated as $23,000. Why? Studies have found that people develop estimates by 
starting from an initial anchor, based upon whatever information is provided, 
and adjusting from there to yield a final answer. Slovic and Lichtenstein (1971) 
have provided conclusive evidence that adjustments away from anchors are 
usually not sufficient to negate the effects of the anchor. In all cases, answers
are biased toward the initial anchor, even if it is irrelevant. Different starting
points yield different answers. Tversky and Kahneman (1973) named this phe-
nomenon anchoring and adjustment.

Tversky and Kahneman (1974) provide systematic, empirical evidence of 
the anchoring effect. For example, in one study, subjects were asked to esti- 
mate the percentage of African countries in the United Nations. For each sub- 
ject, a random number (obtained by an observed spin of a roulette wheel) was 
given as a starting point. From there, subjects were asked to state whether the 
actual value of the quantity was higher or lower than this random value and 
then develop their best estimate for the actual quantity. It was found that the
arbitrary values from the roulette wheel had a substantial impact on estimates.
For example, for groups that received 10 and 65 as starting points, the median
estimates were 25 percent and 45 percent, respectively. Thus, even though
the subjects were aware that the anchor was random and unrelated to the
judgment task, the anchor had a dramatic effect on their judgment. Interestingly,
paying subjects differentially based upon accuracy did not reduce the
magnitude of the anchoring effect.

Salary negotiations represent a very common context for observing anchor- 
ing in the managerial world. For example, pay increases often come in the form 
of a percentage increase. A firm may have an average increase of 8 percent,
with increases for specific employees varying from 3 percent to 13 percent.
While society has led us to accept such systems as equitable, I believe that
such a system falls victim to anchoring and leads to substantial inequities. 
What happens if an employee has been substantially underpaid to begin with? 
The pay system described does not rectify past inequities, since a pay in- 
crease of 11 percent will probably leave that employee still underpaid. Con- 
versely, the system would work in the employee's favor had she been overpaid. 
It is common for an employer to ask job applicants their current salaries. Why? 
Employers are searching for a value from which they can anchor an adjust- 
ment. If the employee is worth far more than his current salary, the anchoring 
and adjustment hypothesis predicts that the firm will make an offer below the 
employee's true value. Does this figure provide fully accurate information about 
the true worth of the employee? I think not. Thus, the use of such compensation 
systems accepts past inequities as an anchor and makes inadequate adjust- 
ments from that point. Further, these findings suggest that in deciding what 
offer to make to a potential employee, any anchor that creeps into the discus- 
sion is likely to have an inappropriate effect on the eventual offer, even if the 
anchor is "ignored" as being ridiculous. 

There are numerous examples of the anchoring-and-adjustment phenomenon
in everyday life.


• In education, children are tracked by a school system that may categorize
them into a certain level of performance at an early age. For example, a child 
who is anchored in the C group may meet expectations of mediocre perfor- 
mance. Conversely, a child of similar abilities anchored in the A track may 
strive to meet expectations, which will keep him in the A track.

• We have all fallen victim to the first-impression syndrome when meeting
someone for the first time. We often place so much emphasis on first impres- 
sions that we do not adjust our opinion appropriately at a later date. 

• Prior to 1973-1974, the speed limit on most interstate highways was 65 miles
per hour (mph), with a normal cruising speed in the left-hand lane of 70 to 75 
mph. This did not seem to be an extraordinarily unsafe speed to most people. 
After 1974, the speed limit was reduced to 55 mph. Most people changed 
their judgments to view a speed of 70 to 75 mph as extremely unsafe— 
"something only crazy kids would do." Today, the reinstitution of the 65 mph 
limit on nonurban highways has rejustified the safety of the 70 to 75 mph 
speed. 


In a fascinating study of anchoring and adjustment in the real estate market, 
Northcraft and Neale (1987) surveyed an association of real estate brokers, 
who indicated that they believed that they could assess the value of properties 
to within 5 percent of their true or appraised value. Further, they were unan- 
imous in stating that they did not factor the listing price of the property into their 
personal estimate of its "true" value. Northcraft and Neale then asked four 
groups of professional real estate brokers and undergraduate students to esti- 
mate the value of a real house. Both brokers and students were randomly 
assigned to one of four experimental groups. In each group, all participants 
were given a 10-page packet of information about the house that was being 
sold. The packet included not only background on the house, but also consid- 
erable information about prices and characteristics of other houses in the area 
that had recently been sold. The only difference in the information given to the 
four groups was the listing price for the house, which was selected to be +11
percent, +4 percent, -4 percent, and -11 percent of the actual appraised
value of the property. After reading the material, all participants toured the
house, as well as the surrounding neighborhood. Participants were then asked
for their estimate of the house's price. The final results suggested that both
brokers and students were significantly affected by the listing price (the an- 
chor) in determining the value. While the students readily admitted the role that 
the listing price played in their decision-making process, the brokers flatly 
denied their use of the listing price as an anchor for their evaluations of the 
property—despite the evidence to the contrary. This study provides convincing
data to indicate that even experts are susceptible to the anchoring bias.
Furthermore, experts are less likely to realize their use of this bias in making 
decisions. 

Joyce and Biddle (1981) have also provided empirical support for the an- 
choring-and-adjustment effect on practicing auditors of Big Eight accounting 
firms. Specifically, subjects in one condition were asked the following: 


It is well known that many cases of management fraud go undetected even when
competent annual audits are performed. The reason, of course, is that Generally
Accepted Auditing Standards are not designed specifically to detect executive-level
management fraud. We are interested in obtaining an estimate from practicing
auditors of the prevalence of executive-level management fraud as a first step in
ascertaining the scope of the problem.


1. Based on your audit experience, is the incidence of significant executive-
level management fraud more than 10 in each 1,000 firms (that is, 1 per-
cent) audited by Big Eight accounting firms?
a. Yes, more than 10 in each 1,000 Big Eight clients have significant execu-
tive-level management fraud.
b. No, fewer than 10 in each 1,000 Big Eight clients have significant execu-
tive-level management fraud.


2. What is your estimate of the number of Big Eight clients per 1,000 that have
significant executive-level management fraud?


(Fill in the blank below with the appropriate number.)
___ in each 1,000 Big Eight clients have significant executive-level man-
agement fraud.


The second condition differed only in that subjects were asked whether the
fraud incidence was more or less than 200 in each 1,000 audited, rather than
10 in 1,000. Subjects in the former condition estimated a fraud incidence of
16.52 per 1,000 on average, compared with an estimated fraud incidence of
43.11 per 1,000 in the second condition! Here, even professional auditors fell
victim to anchoring and adjustment.

The tendency to make insufficient adjustments is a direct result of the an-
choring-and-adjustment heuristic described in the first chapter. Interestingly,
Nisbett and Ross (1980) present an argument that suggests that the anchor-
ing-and-adjustment bias itself dictates that it will be very difficult to get you to
change your decision-making strategies as a result of reading this book. They
argue that each of the heuristics that we identify are currently serving as your
cognitive anchors and are central to your current judgment processes. Thus,
any cognitive strategy that I suggest must be presented and understood in a
manner that will force you to break your existing cognitive anchors. Based on
the evidence in this section, this should be a difficult challenge—but one that is
important enough to be worth the effort!


Bias 10—Conjunctive and
Disjunctive Events Bias 


Problem 11: Which of the following appears most likely? 
Which appears second most likely? 
a. Drawing a red marble from a bag containing 50 percent red marbles and
50 percent white marbles.
b. Drawing a red marble seven times in succession, with replacement (a
selected marble is put back in the bag before the next marble is se-
lected), from a bag containing 90 percent red marbles and 10 percent
white marbles.
c. Drawing at least one red marble in seven tries, with replacement, from a
bag containing 10 percent red marbles and 90 percent white marbles.


The most common answer in ordering the preferences is B-A-C. Interestingly, 
the correct order of likelihood is C (52 percent), A (50 percent), B (48 percent)— 
the exact opposite of the most common intuitive pattern! This result illustrates a 
general bias to overestimate the probability of conjunctive events—events that 
must occur in conjunction with one another (Bar-Hillel, 1973)—and to underesti- 
mate the probability of disjunctive events—events that occur independently 
(Tversky and Kahneman, 1974). Thus, when multiple events all need to occur 
(problem B), we overestimate the true likelihood, while if only one of many events 
needs to occur (problem C), we underestimate the true likelihood. 
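
The three likelihoods in problem 11 follow from two standard formulas for independent draws. The short sketch below reproduces them; the function names are illustrative.

```python
def prob_all(p_single, n):
    """Probability that an event with probability p_single occurs on all n independent tries."""
    return p_single ** n

def prob_at_least_one(p_single, n):
    """Probability that the event occurs on at least one of n independent tries."""
    return 1 - (1 - p_single) ** n

choice_a = 0.5                         # one draw from a bag that is 50 percent red
choice_b = prob_all(0.9, 7)            # seven reds in a row, bag 90 percent red
choice_c = prob_at_least_one(0.1, 7)   # at least one red in seven, bag 10 percent red

for label, p in (("A", choice_a), ("B", choice_b), ("C", choice_c)):
    print(f"{label}: {p:.1%}")
# prints A: 50.0%, B: 47.8%, C: 52.2%
```

Choice B is a conjunctive event (every draw must succeed), while choice C is a disjunctive event (any one success is enough), which is why their probabilities move so far from the single-draw values of 90 percent and 10 percent.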

Kahneman and Tversky (1974) explain these effects in terms of the anchor-
ing-and-adjustment heuristic. They argue that the probability of any one event
occurring (for example, drawing one red marble) provides a natural anchor for 
the judgment of the total probability. Since adjustment from an anchor is typ- 
ically insufficient, the perceived likelihood of choice B stays inappropriately 
close to 90 percent, while the perceived probability of choice C stays inap- 
propriately close to 10 percent. 

How is each of these biases manifested in an applied context? The over- 
estimation of conjunctive events is a powerful explanation of the timing prob- 
lems in projects that require multistage planning. Individuals, businesses, and 
governments frequently fall victim to the conjunctive-events bias in terms of
timing and budgets. Public works projects seldom finish on time or on budget. 
New product ventures frequently take longer than expected. 

Consider the following: 


You are planning a construction project that consists of five distinct compo- 
nents. Your schedule is tight, and every component must be on time in order 
to meet a contractual deadline. Will you meet this deadline? 

You are managing a consulting project that consists of six teams, each of 
which is analyzing a different alternative. The alternatives cannot be com-
pared until all teams complete their portion. Will you meet the deadline?

After three years of study, doctoral students typically dramatically overesti-
mate the likelihood of completing their dissertations within a year. At this
stage, they typically can tell you how long each remaining component will
take. Why do they not finish in one year?
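
The scenarios above can be explored with the same arithmetic. The sketch below assumes, hypothetically, that each component has a 90 percent chance of finishing on time and that the components are independent; neither figure comes from the text.

```python
import math

def prob_all_on_time(component_probs):
    """Probability that every component finishes on time, assuming independence."""
    return math.prod(component_probs)

# Hypothetical figures: each stage is 90 percent likely to be on time.
construction = prob_all_on_time([0.9] * 5)   # five distinct components
consulting = prob_all_on_time([0.9] * 6)     # six teams

print(f"Five components on time: {construction:.0%}")   # 59%
print(f"Six teams on time:       {consulting:.0%}")     # 53%
```

Even with each piece very likely to be on time, the chance that all of them are is not much better than a coin toss, which is the pattern the conjunctive-events bias leads planners to overlook.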


The underestimation of disjunctive events explains our surprise when an 
unlikely event occurs. As Tversky and Kahneman (1974) argue, “A complex 
system, such as a nuclear reactor or the human body, will malfunction if any of 
its essential components fails. Even when the likelihood of failure in each com-
ponent is slight, the probability of an overall failure can be high if many compo-
nents are involved." In Normal Accidents, Perrow (1984) argues against the
safety of technologies like nuclear reactors and DNA research. He fears that 
society significantly underestimates the likelihood of system failure because of 
our judgmental failure to realize the multitude of things that can go wrong in 
these incredibly complex and interactive systems. 

The understanding of our underestimation of disjunctive events also has its 
positive side. Consider the following: 


It's Monday evening (10:00 P.M.). You get a phone call telling you that you must be at 
the Chicago office by 9:30 A.M. the next morning. You call all five airlines that have 
flights that get into Chicago by 9:00 a.m. Each has one flight, and all the flights are 
booked. When you ask the probability of getting on each of the flights if you show up 
at the airport in the morning, you are disappointed to hear probabilities of 30 percent, 
25 percent, 15 percent, 20 percent, and 25 percent. Consequently, you do not 
expect to get to Chicago in time.


In this case, the disjunctive bias leads you to expect the worst. In fact, if the
probabilities given by the airlines are unbiased and independent, there is a 73
percent chance of getting on one of the flights (assuming that you can arrange 
to be at the right ticket counter at the right time)! 
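
The 73 percent figure is one minus the probability of missing all five flights. A minimal sketch of that calculation, using the probabilities quoted by the airlines and the same independence assumption (the function name is illustrative):

```python
import math

def prob_at_least_one_seat(flight_probs):
    """Chance of boarding at least one flight, assuming independent, unbiased odds."""
    prob_miss_all = math.prod(1 - p for p in flight_probs)
    return 1 - prob_miss_all

airline_estimates = [0.30, 0.25, 0.15, 0.20, 0.25]   # from the example above
chance = prob_at_least_one_seat(airline_estimates)
print(f"{chance:.0%}")   # 73%
```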


Bias 11—Overconfidence


Problem 12: Listed below are 10 uncertain quantities. Do not look up any
information on these items. For each, write down your best estimate of the 
quantity. Next, put a lower and upper bound around your estimate, such that 
you are 98 percent confident that your range surrounds the actual quantity. 


a. Mobil Oil's sales in 1987
b. IBM's assets in 1987
c. Chrysler's profit in 1987
d. The number of U.S. industrial firms in 1987 with sales
greater than those of Consolidated Papers
e. The U.S. gross national product in 1945
f. The amount of taxes collected by the U.S. Internal
Revenue Service in 1970
g. The length (in feet) of the Chesapeake Bay Bridge-Tunnel
h. The area (in square miles) of Brazil
i. The size of the black population of San Francisco in 1970
j. The dollar value of Canadian exports of lumber in 1977


How many of your 10 ranges will actually surround the true quantities? If you set 
your ranges so that you were 98 percent confident, you should expect to
correctly bound approximately 9.8, or 9 to 10, of the 10 quantities. Let's look at the
correct answers: (a) $51,223,000,000; (b) $63,688,000,000; (c)
$1,289,700,000; (d) 381; (e) $212,300,000,000; (f) $195,722,096,497; (g)
93,203; (h) 3,286,470; (i) 96,078; (j) $2,386,282,000.

How many of your ranges actually surrounded the true quantities? If you 
surrounded 9 to 10, we can conclude that you were appropriately confident in your
estimation ability. Most people only surround between 3 (30 percent) and 7 (70 
percent), despite claiming a 98 percent confidence that each of the ranges will 
surround the true value. Why? Most of us are overconfident in our estimation 
abilities and do not acknowledge the actual uncertainty that exists. 
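
How unusual would a score of 7 or fewer be for someone whose 98 percent ranges really were well calibrated? The sketch below treats the 10 items as independent trials, which is a simplifying assumption, and uses the binomial distribution; the function name is illustrative.

```python
from math import comb

def prob_hits_at_most(k, n=10, p=0.98):
    """Binomial probability that k or fewer of n ranges contain the true value."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

expected_hits = 10 * 0.98
print(f"Expected hits: {expected_hits:.1f} of 10")
print(f"P(7 or fewer hits if truly 98 percent calibrated): {prob_hits_at_most(7):.4f}")
```

Under that assumption the probability is below 1 in 1,000, so the typical result of 3 to 7 hits is very hard to explain as bad luck.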

In Alpert and Raiffa's (1969) initial demonstration of overconfidence based
upon 1,000 observations (100 subjects on 10 items), 42.6 percent of quantities
fell outside 90 percent confidence ranges. Since then, overconfidence has been
identified as a common judgmental pattern and demonstrated in a wide variety
of settings. For example, Fischhoff, Slovic, and Lichtenstein (1977) found that
subjects who assigned odds of 1,000:1 of being correct were correct only 81
to 88 percent of the time. For odds of 1,000,000:1, their answers were correct
only 90 to 96 percent of the time! Hazard and Peterson (1973) identified over-
confidence among members of the armed forces, while Cambridge and
Shreckengost (1980) found extreme overconfidence in CIA agents.

The most well-established finding in the overconfidence literature is the 
tendency of people to be most overconfident of the correctness of their an- 
swers when asked to respond to questions of moderate to extreme difficulty 
(Fischhoff, Slovic, and Lichtenstein, 1977; Koriat, Lichtenstein, and Fischhoff, 
1980; Lichtenstein and Fischhoff, 1977, 1980). That is, as subjects' knowledge
of a question decreases, they do not correspondingly decrease their level of
confidence (Nickerson and McGoldrick, 1965; Pitz, 1974). However, subjects 
typically demonstrate no overconfidence, and often some underconfidence, to 
questions with which they are familiar. Thus we should be most alert to over- 
confidence in areas outside of our expertise. 

There is a large degree of controversy over the explanations of why over-
confidence exists (see Lichtenstein, Fischhoff, and Phillips [1982] for an exten-
sive discussion). Tversky and Kahneman (1974) explain overconfidence in
terms of anchoring. Specifically, they argue that when individuals are asked to 
set a confidence range around an answer, their initial estimate serves as an 
anchor which biases their estimation of confidence intervals in both directions. 
As explained earlier, adjustments from an anchor are usually insufficient, result- 
ing in an overly narrow confidence band. 

In their review of the overconfidence literature, Lichtenstein, Fischhoff, and 
Phillips (1982) suggest two viable strategies for eliminating overconfidence. 
First, they have found that giving people feedback about their overconfidence 
based on their judgments has been moderately successful at reducing this 
bias. Second, Koriat, Lichtenstein, and Fischhoff (1980) found that asking peo- 
ple to explain why their answers might be wrong (or far off the mark) can 
decrease overconfidence by getting subjects to see contradictions in their 
judgment.

Why should you be concerned about overconfidence? After all, it has proba-
bly given you the courage in the past to attempt endeavors that have stretched
your abilities. However, consider the following:


• You are a medical doctor and are considering performing a difficult opera- 
tion. The patient's family needs to know the likelihood of his surviving the 
operation. You respond "95 percent.” Are you guilty of malpractice if you tend 
to be overconfident in your projections of survival? 

• You work for the Nuclear Regulatory Commission and are 99.9 percent confi- 
dent that a reactor will not leak. Can we trust your confidence? If not, can we 
run the enormous risks of overconfidence in this domain?

• Your firm has been threatened with a multimillion dollar lawsuit. If you lose,
your firm is out of business. You are 98 percent confident that the firm will not 
lose in court. Is this degree of certainty sufficient for you to recommend 
rejecting an out-of-court settlement? Based on what you know now, are you 
still comfortable with your 98 percent estimate? 

• You have developed a market plan for a new product. You are so confident in
your plan that you have not developed any contingencies for early market
failure. The plan of attack falls apart. Will your overconfidence wipe out any
hope of expediting changes in the marketing strategy?


In each of these examples, we have introduced serious problems that can
result from the tendency to be overconfident. Thus, while confidence in your
abilities is necessary for achievement in life, and perhaps to inspire confidence
in others, you may want to monitor your overconfidence to achieve more effec-
tive professional decision making.


Summary The need for an initial anchor weighs strongly in our decision-
making processes when we try to estimate likelihoods (such as the probability
of on-time project completion) or establish values (like what salary to offer).
Experience has taught us that starting from somewhere is easier than starting
from nowhere in determining such figures. However, as the last three biases
(insufficient anchor adjustment, conjunctive and disjunctive events bias, and
overconfidence) show, we frequently overrely on these anchors and seldom
question their validity or appropriateness in a particular situation. As with the
other heuristics, we frequently fail even to realize that this heuristic is impacting
our judgments.


TWO MORE GENERAL BIASES 
Bias 12—The Confirmation Trap


Problem 13: (Adapted from Einhorn and Hogarth, 1978) 
It is claimed that when a particular analyst predicts a rise in the market, the
market always rises. You are to check this claim. Examine the information
available about the following four events (cards):


Card 1          Card 2          Card 3          Card 4
Prediction:     Prediction:     Outcome:        Outcome:
Favorable       Unfavorable     Rise in the     Fall in the
report          report          market          market


Which cards would you turn over for the evidence that you need to check the
analyst's claim?


Consider the two most common responses: (1) "Card 1 (only)—that is the
only card that I know has a favorable report and thus allows me to see whether
a favorable report is actually followed by a rise in the market" and (2) "Cards 1
and 3—card 1 serves as a direct test, while card 3 allows me to see whether
the analyst made a favorable report when I know the market rose." Most
people think that at least one of these two common responses is logical.
However, both strategies demonstrate the tendency to search for confirming,
rather than disconfirming, evidence. Einhorn and Hogarth (1978) argue that
cards 1 and 4 is the correct answer to this quiz item. Why? Consider the following logic:


Card 1 allows me to test the claim: a rise in the market will add confirming
evidence, while a fall in the market will fully disconfirm the claim, since the claim
is that the market will always rise following a favorable report. Card 2 provides no
relevant information, since the claim does not address unfavorable reports. While
card 3 can add confirming evidence to card 1, it provides no unique informa-
tion, since it cannot disconfirm the claim. That is, if an unfavorable report was made
on card 3, then the event is not addressed by the claim. Finally, card 4 is critical. If it
says "favorable report" on the other side, the claim is disconfirmed.

If you chose cards 1 and 3, you may have obtained a wealth of confirmatory
information and decided to accept the claim. Only by turning over card 4
is there the potential for disconfirmation of the hypothesis. Why do so
few subjects select card 4? Most of us seek confirmatory evidence and
exclude disconfirming information from our decision process. However,
it is not possible to know something to be true without checking for
disconfirming evidence.
The initial demonstration of our tendency to ignore disconfirming information
was provided in a series of projects by Wason (1960, 1968a, 1968b). In the first
study, Wason presented subjects with the three-number sequence
2-4-6. The subject's task was to discover the numeric rule to which the three
numbers conformed. To determine the rule, subjects were allowed to generate
other sets of three numbers that the experimenter would classify as either 
conforming or not conforming to the rule. At any point, subjects could stop 
when they thought that they had discovered the rule. How would you approach 
this problem? 

Wason's rule was "any three ascending numbers"—a solution which re- 
quired the accumulation of disconfirming, rather than confirming, evidence. For 
example, if you thought the rule included "the difference between the first two 
numbers equaling the difference between the last two numbers" (a common 
expectation), you must try sequences that do not conform to this rule to find the 
actual rule. Trying the sequences 1-2-3, 10-15-20, 122-126-130, and so on, will 
only lead you into the confirmation trap. In Wason's (1960) experiment, only 6 
out of 29 subjects found the correct rule the first time that they thought they 
knew the answer. Wason concluded that obtaining the correct solution necessi- 
tates "a willingness to attempt to falsify hypotheses, and thus to test those 
intuitive ideas which so often carry the feeling of certitude" (p. 139). 
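
The difference between confirming and disconfirming tests in the 2-4-6 task can be made concrete with a small sketch. It is an illustration under stated assumptions, not a model of Wason's procedure: `true_rule` encodes "any three ascending numbers," `hypothesis` encodes the common guess of equal differences, and a test is counted as informative only when the experimenter's answer differs from what the hypothesis predicted. All function names are hypothetical.

```python
def true_rule(a, b, c):
    # Wason's rule: any three ascending numbers.
    return a < b < c

def hypothesis(a, b, c):
    # The common expectation: the first difference equals the second.
    return (b - a) == (c - b)

def informative(triple):
    """A test teaches us something only when the experimenter's
    classification contradicts the prediction of our hypothesis."""
    return true_rule(*triple) != hypothesis(*triple)

# Sequences chosen because they fit the hypothesis (the confirmation trap).
confirming = [(1, 2, 3), (10, 15, 20), (122, 126, 130)]
# Sequences chosen because they violate the hypothesis.
disconfirming = [(1, 2, 4), (2, 4, 100)]

print([informative(t) for t in confirming])     # [False, False, False]
print([informative(t) for t in disconfirming])  # [True, True]
```

Every confirming sequence is classified as conforming, exactly as the hypothesis predicts, so it cannot reveal that the hypothesis is too narrow. The sequences that break the hypothesis are the ones that expose the actual rule.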

This result was also observed by Einhorn and Hogarth (1978) with a sample 
of 23 statisticians. When that group responded to a problem very similar to 
problem 13, eleven asked for card 1; one asked for card 1 or 3; one asked for 
any one card; two asked for card 1 or 4; three asked for card 4 alone; and only
five trained statisticians asked for cards 1 and 4. Thus, this group tended to 
realize the worthlessness of card 3 but failed to realize the importance of card 
4. This leads to the conclusion that the tendency to exclude disconfirming 
information in the search process is not eliminated by the formal scientific 
training that is expected of statisticians.

It is easy to observe the confirmation trap in your decision-making pro- 
cesses. You make a tentative decision (to buy a new car, to hire a particular 
employee, to start research and development on a new product line). Do you 
search for data that support your decision before making the final commit- 
ment? Most of us do. However, the existence of the confirmation trap implies 
that the search for challenging, or disconfirming, evidence will provide the 
most useful insights. For example, in confirming your decision to hire a particu- 
lar employee, it is probably easy to find supporting positive information on the 
individual, but in fact the key issue may be the degree to which negative 
information on this individual, as well as positive information on another
potential applicant, also exists.


Bias 13—Hindsight


Consider the following scenarios: 
* You are an avid football fan, and you are watching a critical game in which
your team is behind 35-31. With three seconds left, and the ball on the
opponent's three-yard line, the quarterback unsuccessfully calls a pass
play into the corner of the endzone. You immediately respond, "I knew that
he shouldn't have called that play."


* You are riding in an unfamiliar area, and your spouse is driving. You
approach an unmarked fork in the road, and your spouse decides to go to
the right. Four miles and fifteen minutes later, it is clear that you are lost. 
You blurt out, "I knew that you should have turned left at the fork." 


* A manager who works for you hired a new supervisor last year. You were 
well aware of the choices he had at the time and allowed him to choose the 
new employee on his own. You have just received production data on
every supervisor. The data on the new supervisor are terrible. You call in
the manager and claim, "There was plenty of evidence that he (the
supervisor) was not the man for the job."

* As director of marketing in a consumer-goods organization, you have just 
presented the results of an extensive six-month study on current con- 
sumer preferences for the products manufactured by your organization. 
After the conclusion of your presentation, a senior vice-president re- 
sponds, "I don't know why we spent so much time and money to collect 
these data. I could have told you what the results were going to be."


Do you recognize yourself? Do you recognize someone else? Each scenario is 
representative of a phenomenon that has been named "the Monday morning 
quarterback syndrome" (Fischhoff, 1975b), "the knew-it-all-along effect" (Wood, 
1978), “creeping determinism" (Fischhoff, 1975a, 1975b, 1980), and "the 
hindsight bias" (Fischhoff, 1975a, 1975b). This body of research demonstrates 
that people are typically not very good at recalling or reconstructing the way an 
uncertain situation appeared to them before finding out the results of the 
decision. What play would you have called? Did you really know that your
spouse should have turned left? Was there really evidence that the selected 
supervisor was not the man for the job? Could the senior vice-president really 
have predicted the results of the survey? Perhaps our intuition is sometimes 
accurate, but we tend to overestimate what we knew and distort our beliefs 
about what we knew beforehand based upon what we later found out. The 
phenomenon occurs when people look back on the judgment of others, as well 
as of themselves. 

Fischhoff has provided substantial evidence of the prevalence of the 
hindsight effect (1975a, 1975b, 1977; Fischhoff and Beyth, 1975; Slovic and 
Fischhoff, 1977). For example, Fischhoff (1975a) examined the differences
between hindsight and foresight in the context of judging historical events and
clinical instances. In one study, subjects were divided into five groups and 
asked to read a passage about the war between the British and Gurka forces in 
1814. One group was not told the result of the war. The remaining four groups 
of subjects were told either that (1) the British won; (2) the Gurkas won; (3) a 
military stalemate was reached with no peace settlement; or (4) a military 
stalemate was reached with a peace settlement. Obviously, only one group 
was told the truthful outcome—(1) in this case. Each subject was then asked
what his or her subjective assessments of the probability of each of the out- 
comes would have been without the benefit of knowing the reported outcome. 
Based upon this and other varied examples, the strong, consistent finding was


Table 2.2 Summary of 13 Biases Presented in Chapter 2

Bias                                        Description

Biases Emanating from the Availability Heuristic

 1 Ease of recall                           Individuals judge events that are
                                            more easily recalled from memory,
                                            based upon vividness or recency, to
                                            be more numerous than events of
                                            equal frequency whose instances are
                                            less easily recalled.

 2 Retrievability                           Individuals are biased in their
                                            assessments of the frequency of
                                            events based upon how their
                                            memory structures affect the search
                                            process.

 3 Presumed associations                    Individuals tend to overestimate the
                                            probability of two events co-
                                            occurring based upon the number of
                                            similar associations that are easily
                                            recalled, whether from experience or
                                            social influence.

Biases Emanating from the Representativeness Heuristic

 4 Insensitivity to base rates              Individuals tend to ignore base rates
                                            in assessing the likelihood of events
                                            when any other descriptive
                                            information is provided, even if it is
                                            irrelevant.

 5 Insensitivity to sample size             Individuals frequently fail to
                                            appreciate the role of sample size in
                                            assessing the reliability of sample
                                            information.

 6 Misconceptions of chance                 Individuals expect that a sequence
                                            of data generated by a random
                                            process will look "random," even
                                            when the sequence is too short for
                                            those expectations to be statistically
                                            valid.

 7 Regression to the mean                   Individuals tend to ignore the fact
                                            that extreme events tend to regress
                                            to the mean on subsequent trials.

 8 The conjunction fallacy                  Individuals falsely judge that
                                            conjunctions (two events co-
                                            occurring) are more probable than a
                                            more global set of occurrences of
                                            which the conjunction is a subset.

Biases Emanating from Anchoring and Adjustment

 9 Insufficient anchor adjustment           Individuals make estimates for
                                            values based upon an initial value
                                            (derived from past events, random
                                            assignment, or whatever information
                                            is available) and typically make
                                            insufficient adjustments from that
                                            anchor when establishing a final
                                            value.

10 Conjunctive and disjunctive events bias  Individuals exhibit a bias toward
                                            overestimating the probability of
                                            conjunctive events and
                                            underestimating the probability of
                                            disjunctive events.

11 Overconfidence                           Individuals tend to be overconfident
                                            of the infallibility of their judgments
                                            when answering moderately to
                                            extremely difficult questions.

Two More General Biases

12 The confirmation trap                    Individuals tend to seek confirmatory
                                            information for what they think is true
                                            and neglect the search for
                                            disconfirmatory evidence.

13 Hindsight                                After finding out whether or not an
                                            event occurred, individuals tend to
                                            overestimate the degree to which
                                            they would have predicted the
                                            correct outcome.


heuristics because, on average, any loss in quality of decisions is outweighed
by the time saved. However, we argue against blanket acceptance of heuristics
based upon this logic. First, as we have demonstrated in this chapter, there
are many instances in which the loss in the quality of decisions far outweighs
the time saved by the use of the heuristics. Second, the foregoing logic
suggests that we have voluntarily accepted tradeoffs associated with the use of
heuristics. But in reality, we have not: Most of us are unaware of their existence
and their on-going impact upon our decision making. The difficulty with
heuristics is that we typically do not recognize that we are using them, and we
consequently fail to distinguish between situations in which their use is more
and less appropriate.

To emphasize the distinction between the legitimate and illegitimate uses of 
heuristics, reconsider problem 6. In that problem, subjects tend to predict that
Mark is more likely to take a job in "management of the arts," despite the fact
that the contextual data overwhelmingly favor "management consulting.” The 
representativeness heuristic, in this case, prevents us from appropriately incor- 
porating relevant base-rate data. However, if the choice of “management con- 
sulting" were replaced with another less common career path for an MBA from 
a prestigious university (such as management in the steel industry), then the 
representativeness heuristic is likely to lead to an accurate prediction. That is, 
when base-rate data are unavailable or irrelevant (that is, the choices have the 
same base-rate), the representativeness heuristic provides a reasonably good 
cognitive tool for matching Mark to his most likely career path. The key to 
improved judgment, therefore, lies in learning to distinguish between appropri- 
ate and inappropriate uses of heuristics. This chapter provides a start in learn- 
ing to make this distinction. 

This book's examination of biases and heuristics does not end here. In fact, 
in the next three chapters we will continue to examine biases and heuristics in 
the areas of risk, the escalation of commitment, and creativity. The latter part of 
the book will examine biases in the context of more complicated multiparty 
decision-making situations.


A Gentle Introduction to
Genetic Algorithms


In this chapter, we introduce genetic algorithms: what they are, where they came 
from, and how they compare to and differ from other search procedures. We 
illustrate how they work with a hand calculation, and we start to understand their 
power through the concept of a schema or similarity template. 


WHAT ARE GENETIC ALGORITHMS? 


Genetic algorithms are search algorithms based on the mechanics of natural se- 
lection and natural genetics. They combine survival of the fittest among string 
structures with a structured yet randomized information exchange to form a 
search algorithm with some of the innovative flair of human search. In every 
generation, a new set of artificial creatures (strings) is created using bits and 
pieces of the fittest of the old; an occasional new part is tried for good measure. 
While randomized, genetic algorithms are no simple random walk. They effi- 
ciently exploit historical information to speculate on new search points with ex- 
pected improved performance. 

Genetic algorithms have been developed by John Holland, his colleagues, and 
his students at the University of Michigan. The goals of their research have been 
twofold: (1) to abstract and rigorously explain the adaptive processes of natural 
systems, and (2) to design artificial systems software that retains the important 
mechanisms of natural systems. This approach has led to important discoveries 
in both natural and artificial systems science. 

The central theme of research on genetic algorithms has been robustness, 
the balance between efficiency and efficacy necessary for survival in many different
environments. The implications of robustness for artificial systems are manifold.
If artificial systems can be made more robust, costly redesigns can be
reduced or eliminated. If higher levels of adaptation can be achieved, existing 
systems can perform their functions longer and better. Designers of artificial sys- 
tems—both software and hardware, whether engineering systems, computer sys- 
tems, or business systems—can only marvel at the robustness, the efficiency, and 
the flexibility of biological systems. Features for self-repair, self-guidance, and re- 
production are the rule in biological systems, whereas they barely exist in the 
most sophisticated artificial systems. 

Thus, we are drawn to an interesting conclusion: where robust performance 
is desired (and where is it not?), nature does it better; the secrets of adaptation 
and survival are best learned from the careful study of biological example. Yet 
we do not accept the genetic algorithm method by appeal to this beauty-of-nature 
argument alone. Genetic algorithms are theoretically and empirically proven to 
provide robust search in complex spaces. The primary monograph on the topic 
is Holland's (1975) Adaptation in Natural and Artificial Systems. Many papers 
and dissertations establish the validity of the technique in function optimization 
and control applications. Having been established as a valid approach to problems 
requiring efficient and effective search, genetic algorithms are now finding more 
widespread application in business, scientific, and engineering circles. The rea- 
sons behind the growing numbers of applications are clear. These algorithms are 
computationally simple yet powerful in their search for improvement. Further- 
more, they are not fundamentally limited by restrictive assumptions about the 
search space (assumptions concerning continuity, existence of derivatives,
unimodality, and other matters). We will investigate the reasons behind these
attractive qualities; but before this, we need to explore the robustness of more widely
accepted search procedures. 


ROBUSTNESS OF TRADITIONAL OPTIMIZATION 
AND SEARCH METHODS 


This book is not a comparative study of search and optimization techniques.
Nonetheless, it is important to question whether conventional search methods
meet our robustness requirements. The current literature identifies three main 
types of search methods: calculus-based, enumerative, and random. Let us
examine each type to see what conclusions may be drawn without formal testing.

Calculus-based methods have been studied heavily. These subdivide into two
main classes: indirect and direct. Indirect methods seek local extrema by solving 
the usually nonlinear set of equations resulting from setting the gradient of the 
objective function equal to zero. This is the multidimensional generalization of 
the elementary calculus notion of extremal points, as illustrated in Fig. 1.1. Given 
a smooth, unconstrained function, finding a possible peak starts by restricting 
search to those points with slopes of zero in all directions. On the other hand,
direct (search) methods seek local optima by hopping on the function and moving
in a direction related to the local gradient. This is simply the notion of
hill-climbing: to find the local best, climb the function in the steepest permissible
direction. While both of these calculus-based methods have been improved,
extended, hashed, and rehashed, some simple reasoning shows their lack of
robustness.

FIGURE 1.1 The single-peak function is easy for calculus-based methods.

First, both methods are local in scope; the optima they seek are the best in a
neighborhood of the current point. For example, suppose that Fig. 1.1 shows a
portion of the complete domain of interest; a more complete picture is shown in
Fig. 1.2. Clearly, starting the search or zero-finding procedures in the neighborhood
of the lower peak will cause us to miss the main event (the higher peak).
Furthermore, once the lower peak is reached, further improvement must be 
sought through random restart or other trickery. Second, calculus-based methods 
depend upon the existence of derivatives (well-defined slope values). Even if we 
allow numerical approximation of derivatives, this is a severe shortcoming. Many 
practical parameter spaces have little respect for the notion of a derivative and 
the smoothness this implies. Theorists interested in optimization have been too 
willing to accept the legacy of the great eighteenth- and nineteenth-century
mathematicians who painted a clean world of quadratic objective functions, ideal
constraints, and ever-present derivatives. The real world of search is fraught with
discontinuities and vast multimodal, noisy search spaces as depicted in a less 
calculus-friendly function in Fig. 1.3. It comes as no surprise that methods de- 
pending upon the restrictive requirements of continuity and derivative existence 
are unsuitable for all but a very limited problem domain. For this reason and
because of their inherently local scope of search, we must reject calculus-based
methods. They are insufficiently robust in unintended domains.

FIGURE 1.2 The multiple-peak function causes a dilemma. Which hill should
we climb?
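
The local scope of hill-climbing is easy to reproduce in a few lines. The sketch below is only an illustration: the two-peak objective `f` is a hypothetical stand-in for the kind of function pictured in Fig. 1.2, reduced to one integer variable, and `hill_climb` is a bare steepest-ascent routine rather than any particular published method.

```python
def f(x):
    # Hypothetical two-peak objective on the integers 0..30:
    # a lower peak at x = 5 (height 10) and a higher peak at x = 20 (height 25).
    return max(10 - abs(x - 5), 25 - 2 * abs(x - 20), 0)

def hill_climb(x, lo=0, hi=30):
    """Step to the better neighbor until no neighbor improves on x."""
    while True:
        neighbors = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbors, key=f)
        if f(best) <= f(x):
            return x  # local optimum: no uphill move remains
        x = best

print(hill_climb(3))   # 5  (stuck on the lower peak)
print(hill_climb(14))  # 20 (reaches the higher peak)
```

The answer depends entirely on the starting point. A climber started near the lower peak never learns that a higher one exists.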

Enumerative schemes have been considered in many shapes and sizes. The 
idea is fairly straightforward; within a finite search space, or a discretized infinite 
search space, the search algorithm starts looking at objective function values at 
every point in the space, one at a time. Although the simplicity of this type of
algorithm is attractive, and enumeration is a very human kind of search (when
the number of possibilities is small), such schemes must ultimately be discounted
in the robustness race for one simple reason: lack of efficiency. Many practical
spaces are simply too large to search one at a time and still have a chance of using
the information to some practical end. Even the highly touted enumerative
scheme dynamic programming breaks down on problems of moderate size and
complexity, suffering from a malady melodramatically labeled the "curse of
dimensionality" by its creator (Bellman, 1961). We must conclude that less clever
enumerative schemes are similarly, and more abundantly, cursed for real
problems.

FIGURE 1.3 Many functions are noisy and discontinuous and thus unsuitable
for search by traditional methods.
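
The efficiency argument can be seen by counting. The sketch below assumes, purely for illustration, a search space of binary strings and a toy payoff (`count_ones`); the function names are hypothetical. Enumeration is trivial for 5 bits, and the same loop becomes hopeless as the string grows because the number of points doubles with every added bit.

```python
from itertools import product

def enumerate_best(payoff, n_bits):
    """Evaluate every binary string of length n_bits and return the best."""
    candidates = ("".join(bits) for bits in product("01", repeat=n_bits))
    return max(candidates, key=payoff)

def search_space_size(n_bits):
    """Number of points an enumerative scheme must visit."""
    return 2 ** n_bits

def count_ones(s):
    # Hypothetical payoff used only for illustration.
    return s.count("1")

print(enumerate_best(count_ones, 5))  # 11111

for n in (5, 20, 60):
    print(n, search_space_size(n))
```

At 5 bits there are 32 points to visit; at 20 bits there are already more than a million.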

Random search algorithms have achieved increasing popularity as research- 
ers have recognized the shortcomings of calculus-based and enumerative 
schemes. Yet, random walks and random schemes that search and save the best 
must also be discounted because of the efficiency requirement. Random searches, 
in the long run, can be expected to do no better than enumerative schemes. In 
our haste to discount strictly random search methods, we must be careful to 
separate them from randomized techniques. The genetic algorithm is an example 
of a search procedure that uses random choice as a tool to guide a highly exploi- 
tative search through a coding of a parameter space. Using random choice as a 
tool in a directed search process seems strange at first, but nature contains many 
examples. Another currently popular search technique, simulated annealing, 
uses random processes to help guide its form of search for minimal energy states. 
A recent book (Davis, 1987) explores the connections between simulated an- 
nealing and genetic algorithms. The important thing to recognize at this juncture 
is that randomized search does not necessarily imply directionless search. 

While our discussion has been no exhaustive examination of the myriad
methods of traditional optimization, we are left with a somewhat unsettling con- 
clusion: conventional search methods are not robust. This does not imply that 
they are not useful. The schemes mentioned and countless hybrid combinations 
and permutations have been used successfully in many applications; however, as 
more complex problems are attacked, other methods will be necessary. To put 
this point in better perspective, inspect the problem spectrum of Fig. 1.4. In the 
figure a mythical effectiveness index is plotted across a problem continuum for a 
specialized scheme, an enumerative scheme, and an idealized robust scheme. The 
gradient technique performs well in its narrow problem class, as we expect, but 
it becomes highly inefficient (if useful at all) elsewhere. On the other hand, the 
enumerative scheme performs with egalitarian inefficiency across the spectrum 
of problems, as shown by the lower performance curve. Far more desirable would 
be a performance curve like the one labeled Robust Scheme. It would be worth- 
while sacrificing peak performance on a particular problem to achieve a relatively 
high level of performance across the spectrum of problems. (Of course, with 
broad, efficient methods we can always create hybrid schemes that combine the 
best of the local search method with the more general robust scheme. We will 
have more to say about this possibility in Chapter 5.) We shall soon see how 
genetic algorithms help fill this robustness gap.

FIGURE 1.4 Many traditional schemes work well in a narrow problem domain.
Enumerative schemes and random walks work equally inefficiently across a broad
spectrum. A robust method works well across a broad spectrum of problems.


THE GOALS OF OPTIMIZATION 


Before examining the mechanics and power of a simple genetic algorithm, we 
must be clearer about our goals when we say we want to optimize a function or 
a process. What are we trying to accomplish when we optimize? The conventional
view is presented well by Beightler, Phillips, and Wilde (1979, p. 1):


Man's longing for perfection finds expression in the theory of optimiza- 
tion. It studies how to describe and attain what is Best, once one knows 
how to measure and alter what is Good or Bad... . Optimization theory 
encompasses the quantitative study of optima and methods for finding 
them. 


Thus optimization seeks to improve performance toward some optimal point or 
points. Note that this definition has two parts: (1) we seek improvement to ap- 
proach some (2) optimal point. There is a clear distinction between the process 
of improvement and the destination or optimum itself. Yet, in judging optimiza- 
tion procedures we commonly focus solely upon convergence (does the method 
reach the optimum?) and forget entirely about interim performance. This empha- 
sis stems from the origins of optimization in the calculus. It is not, however, a 
natural emphasis.

Consider a human decision maker, for example, a businessman. How do we
judge his decisions? What criteria do we use to decide whether he has done 
a good or bad job? Usually we say he has done well when he makes adequate 
selections within the time and resources allotted. Goodness is judged relative 
to his competition. Does he produce a better widget? Does he get it to market 
more efficiently? With better promotion? We never judge a businessman by an 
attainment-of-the-best criterion; perfection is all too stern a taskmaster. As a re- 
sult, we conclude that convergence to the best is not an issue in business or in 
most walks of life; we are only concerned with doing better relative to others. 
Thus, if we want more humanlike optimization tools, we are led to a reordering
of the priorities of optimization. The most important goal of optimization is
improvement. Can we get to some good, "satisficing" (Simon, 1969) level of
performance quickly? Attainment of the optimum is much less important for
complex systems. It would be nice to be perfect; meanwhile, we can only strive
to improve. In the next chapter we watch the genetic algorithm for these
qualities; here we outline some important differences between genetic algorithms and
more traditional methods. 


HOW ARE GENETIC ALGORITHMS DIFFERENT FROM 
TRADITIONAL METHODS? 


In order for genetic algorithms to surpass their more traditional cousins in the 
quest for robustness, GAs must differ in some very fundamental ways. Genetic
algorithms are different from more normal optimization and search procedures 
in four ways: 


1. GAs work with a coding of the parameter set, not the parameters themselves.

2. GAs search from a population of points, not a single point.

3. GAs use payoff (objective function) information, not derivatives or other
   auxiliary knowledge.

4. GAs use probabilistic transition rules, not deterministic rules.


Genetic algorithms require the natural parameter set of the optimization 
problem to be coded as a finite-length string over some finite alphabet. As an 
example, consider the optimization problem posed in Fig. 1.5. We wish to maximize
the function f(x) = x² on the integer interval [0, 31]. With more traditional
methods we would be tempted to twiddle with the parameter x, turning it like
the vertical hold knob on a television set, until we reached the highest objective
function value. With GAs, the first step of our optimization process is to code the
parameter x as a finite-length string. There are many ways to code the x parameter,
and Chapter 3 examines some of these in detail. At the moment, let's consider
an optimization problem where the coding comes a bit more naturally.
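
As one hedged illustration of what coding the parameter x might look like, the sketch below uses a five-bit unsigned binary string, which is enough to cover the integers 0 through 31. This is only one of the many possible codings the text mentions, chosen here as an assumption; the helper names `encode` and `decode` are hypothetical.

```python
def encode(x, n_bits=5):
    """Code an integer parameter as a fixed-length binary string."""
    if not 0 <= x < 2 ** n_bits:
        raise ValueError("x is out of range for this string length")
    return format(x, f"0{n_bits}b")

def decode(s):
    """Recover the integer parameter from its binary string."""
    return int(s, 2)

def f(x):
    # Objective function from the example: f(x) = x squared.
    return x ** 2

print(encode(13))          # 01101
print(decode("11000"))     # 24
print(f(decode("11111")))  # 961
```

The search then manipulates the strings, and the decoded value is needed only when the objective function is evaluated.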

Consider the black box switching problem illustrated in Fig. 1.6. This prob- 
lem concerns a black box device with a bank of five input switches. For every 
setting of the five switches, there is an output signal f, mathematically f = f(s),
where s is a particular setting of the five switches. The objective of the problem
is to set the switches to obtain the maximum possible f value. With other
methods of optimization we might work directly with the parameter set (the switch
settings) and toggle switches from one setting to another using the transition
rules of our particular method. With genetic algorithms, we first code the
switches as a finite-length string. A simple code can be generated by considering
a string of five 1's and 0's where each of the five switches is represented by a 1 if
the switch is on and a 0 if the switch is off. With this coding, the string 11110
codes the setting where the first four switches are on and the fifth switch is off.
Some of the codings introduced later will not be so obvious, but at this juncture
we acknowledge that genetic algorithms use codings. Later it will be apparent
that genetic algorithms exploit coding similarities in a very general way; as a
result, they are largely unconstrained by the limitations of other methods
(continuity, derivative existence, unimodality, and so on).

FIGURE 1.5 A simple function optimization example, the function f(x) = x² on
the integer interval [0, 31].

FIGURE 1.6 A black box optimization problem with five on-off switches illustrates
the idea of a coding and a payoff measure. Genetic algorithms only require
these two things: they don't need to know the workings of the black box.
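
The switch coding just described is simple enough to write down directly. The sketch below is illustrative only; the function names are hypothetical, and the switch bank is represented as a list of booleans as an assumption.

```python
def code_switches(switches):
    """Map on/off switch states to a string of 1's and 0's (1 = on, 0 = off)."""
    return "".join("1" if on else "0" for on in switches)

def decode_switches(s):
    """Recover the list of switch states from a coded string."""
    return [ch == "1" for ch in s]

# First four switches on, fifth switch off.
setting = [True, True, True, True, False]
print(code_switches(setting))  # 11110
print(decode_switches("11110") == setting)  # True
```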

In many optimization methods, we move gingerly from a single point in the 
decision space to the next using some transition rule to determine the next point. 
This point-to-point method is dangerous because it is a perfect prescription for 
locating false peaks in multimodal (many-peaked) search spaces. By contrast, GAs
work from a rich database of points simultaneously (a population of strings), 
climbing many peaks in parallel; thus, the probability of finding a false peak is 
reduced over methods that go point to point. As an example, let's consider our 
black box optimization problem (Fig. 1.6) again. Other techniques for solving 
this problem might start with one set of switch settings, apply some transition 
rules, and generate a new trial switch setting. A genetic algorithm starts with a 
population of strings and thereafter generates successive populations of strings. 
For example, in the five-switch problem, a random start using successive coin 
flips (head = 1, tail = 0) might generate the initial population of size n = 4 
(small by genetic algorithm standards): 


01101 
11000 
01000 
10011 


After this start, successive populations are generated using the genetic algorithm. 
By working from a population of well-adapted diversity instead of a single point,
the genetic algorithm adheres to the old adage that there is safety in numbers; 
we will soon see how this parallel flavor contributes to a genetic algorithm's 
robustness. 
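
The random start described above can be sketched as follows. This is a minimal illustration, assuming Python's standard pseudorandom generator as a stand-in for a fair coin (head = 1, tail = 0); the function name `random_population` and its parameters are hypothetical.

```python
import random

def random_population(n=4, length=5, rng=None):
    """Build n strings of the given length from successive coin flips."""
    if rng is None:
        rng = random.Random()
    return ["".join(rng.choice("01") for _ in range(length)) for _ in range(n)]

# n = 4 strings of 5 bits each: 20 coin flips in total.
population = random_population(rng=random.Random(1))
for string in population:
    print(string)
```

Passing a seeded generator makes the start reproducible, which is convenient when comparing runs. The particular strings drawn will generally differ from the four shown in the text.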

Many search techniques require much auxiliary information in order to work 
properly. For example, gradient techniques need derivatives (calculated analytically
or numerically) in order to be able to climb the current peak, and other
local search procedures like the greedy techniques of combinatorial optimization 
(Lawler, 1976; Syslo, Deo, and Kowalik, 1983) require access to most if not all 
tabular parameters. By contrast, genetic algorithms have no need for all this aux- 
iliary information: GAs are blind. To perform an effective search for better and 
better structures, they only require payoff values (objective function values)
associated with individual strings. This characteristic makes a GA a more canonical
method than many search schemes. After all, every search problem has a metric 
(or metrics) relevant to the search; however, different search problems have 
vastly different forms of auxiliary information. Only if we refuse to use this aux- 
iliary information can we hope to develop the broadly based schemes we desire. 
On the other hand, the refusal to use specific knowledge when it does exist can 
place an upper bound on the performance of an algorithm when it goes head to 
head with methods designed for that problem. Chapter 5 examines ways to use 
nonpayoff information in so-called knowledge-directed genetic algorithms; how- 
ever, at this juncture we stress the importance of the blindness assumption to 
pure genetic algorithm robustness.

Unlike many methods, GAs use probabilistic transition rules to guide their
search. To persons familiar with deterministic methods this seems odd, but the 
use of probability does not suggest that the method is some simple random 
search; this is not decision making at the toss of a coin. Genetic algorithms use 
random choice as a tool to guide a search toward regions of the search space with 
likely improvement.

Taken together, these four differences—direct use of a coding, search from a 
population, blindness to auxiliary information, and randomized operators—con- 
tribute to a genetic algorithm's robustness and resulting advantage over other 
more commonly used techniques. The next section introduces a simple three- 
operator genetic algorithm. 


A SIMPLE GENETIC ALGORITHM 


The mechanics of a simple genetic algorithm are surprisingly simple, involving
nothing more complex than copying strings and swapping partial strings. The
explanation of why this simple process works is much more subtle and powerful.
Simplicity of operation and power of effect are two of the main attractions of the
genetic algorithm approach.

The previous section pointed out how genetic algorithms process popula- 
tions of strings. Recalling the black box switching problem, remember that the 
initial population had four strings: 

01101
11000
01000
10011


Also recall that this population was chosen at random through 20 successive flips 
of an unbiased coin. We now must define a set of simple operations that take this 
initial population and generate successive populations that (we hope) improve 
over time. 

A simple genetic algorithm that yields good results in many practical prob- 
lems is composed of three operators: 


1. Reproduction
2. Crossover
3. Mutation


TABLE 1.1 Sample Problem Strings and Fitness Values

No.      String    Fitness    % of Total
1        01101       169         14.4
2        11000       576         49.2
3        01000        64          5.5
4        10011       361         30.9
Total               1170        100.0

Reproduction is a process in which individual strings are copied according
to their objective function values, f (biologists call this function the fitness
function). Intuitively, we can think of the function f as some measure of profit,
utility, or goodness that we want to maximize. Copying strings according to their
fitness values means that strings with a higher value have a higher probability of
contributing one or more offspring in the next generation. This operator, of
course, is an artificial version of natural selection, a Darwinian survival of the
fittest among string creatures. In natural populations fitness is determined by a
creature's ability to survive predators, pestilence, and the other obstacles to
adulthood and subsequent reproduction. In our unabashedly artificial setting, the
objective function is the final arbiter of the string-creature's life or death.
The reproduction operator may be implemented in algorithmic form in a
number of ways. Perhaps the easiest is to create a biased roulette wheel where
each current string in the population has a roulette wheel slot sized in proportion
to its fitness. Suppose the sample population of four strings in the black box
problem has objective or fitness function values f as shown in Table 1.1 (for now
we accept these values as the output of some unknown and arbitrary black box;
later we will examine a function and coding that generate these same values).
Summing the fitness over all four strings, we obtain a total of 1170. The
percentage of population total fitness is also shown in the table. The
corresponding weighted roulette wheel for this generation's reproduction is shown
in Fig. 1.7. To reproduce, we simply spin the weighted roulette wheel thus defined
four times. For the example problem, string number 1 has a fitness value of 169,
which represents 14.4 percent of the total fitness. As a result, string 1 is given
14.4 percent of the biased roulette wheel, and each spin turns up string 1 with
probability 0.144. Each time we require another offspring, a simple spin of the
weighted roulette wheel yields the reproduction candidate. In this way, more
highly fit strings have a higher number of offspring in the succeeding generation.
Once a string has been selected for reproduction, an exact replica of the string
is made. This string is then entered into a mating pool, a tentative new population,
for further genetic operator action.

FIGURE 1.7 Simple reproduction allocates offspring strings using a roulette
wheel with slots sized according to fitness. The sample wheel is sized for the
problem of Tables 1.1 and 1.2.
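One way to realize the biased roulette wheel is sketched below. It is a minimal illustration under stated assumptions, not the only possible implementation: the helper name `roulette_select`, the use of cumulative fitness sums as slot edges, and the seed are all choices made for this example.

```python
import random
from itertools import accumulate


def roulette_select(population, fitnesses, rng):
    """Spin a wheel whose slots are proportional to fitness; return one string."""
    edges = list(accumulate(fitnesses))      # right-hand edge of each slot
    spin = rng.random() * edges[-1]          # where the wheel stops
    for string, edge in zip(population, edges):
        if spin < edge:
            return string
    return population[-1]                    # guard against round-off


population = ["01101", "11000", "01000", "10011"]
fitnesses = [169, 576, 64, 361]

# Slot sizes as fractions of the wheel (compare the last column of Table 1.1).
shares = [f / sum(fitnesses) for f in fitnesses]

rng = random.Random(1)                       # seed chosen only for repeatability
mating_pool = [roulette_select(population, fitnesses, rng) for _ in range(4)]
print(shares, mating_pool)
```

Four spins fill a mating pool of the same size as the population; because the spins are random, any single run may differ from the hand simulation discussed later.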

After reproduction, simple crossover (Fig. 1.8) may proceed in two steps.
First, members of the newly reproduced strings in the mating pool are mated at
random. Second, each pair of strings undergoes crossing over as follows: an
integer position k along the string is selected uniformly at random between 1 and
the string length less one, [1, l - 1]. Two new strings are created by swapping all
characters between positions k + 1 and l inclusively. For example, consider
strings $A_1$ and $A_2$ from our example initial population:


$A_1$ = 0 1 1 0 | 1
$A_2$ = 1 1 0 0 | 0


Suppose in choosing a random number between 1 and 4, we obtain k = 4 (as
indicated by the separator symbol |). The resulting crossover yields two new
strings where the prime (') means the strings are part of the new generation:


$A'_1$ = 0 1 1 0 0
$A'_2$ = 1 1 0 0 1
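The crossover step just described reduces to two string slices. The sketch below is illustrative; the function names `crossover` and `random_site` are assumptions introduced here, and the example call simply replays the k = 4 case from the text.

```python
import random


def crossover(a, b, k):
    """Swap all characters in positions k+1..l (1-based) between a and b."""
    return a[:k] + b[k:], b[:k] + a[k:]


def random_site(length, rng):
    """Pick a cross site uniformly from 1..length-1."""
    return rng.randint(1, length - 1)


# Replaying the worked example: strings 01101 and 11000 crossed at k = 4.
child1, child2 = crossover("01101", "11000", 4)
print(child1, child2)
```

Because Python slices are 0-based and half-open, `a[:k]` keeps the first k characters and `b[k:]` supplies positions k + 1 through l, which matches the 1-based description above.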


FIGURE 1.8 A schematic of simple crossover shows the alignment of two
strings and the partial exchange of information, using a cross site chosen at
random. (Panels: BEFORE CROSSOVER and AFTER CROSSOVER, with the CROSSING SITE marked.)

The mechanics of reproduction and crossover are surprisingly simple, involving
random number generation, string copies, and some partial string exchanges.
Nonetheless, the combined emphasis of reproduction and the structured, though 
randomized, information exchange of crossover give genetic algorithms much of 
their power. At first this seems surprising. How can two such simple (and
computationally trivial) operators result in anything useful, let alone a rapid and
robust search mechanism? Furthermore, doesn't it seem a little strange that chance
should play such a fundamental role in a directed search process? We will
examine a partial answer to the first of these two questions in a moment; the answer
to the second question was well recognized by the mathematician J. Hadamard
(1949, p. 29):


We shall see a little later that the possibility of imputing discovery to 
pure chance is already excluded.... On the contrary, that there is an 
intervention of chance but also a necessary work of unconsciousness, 
the latter implying and not contradicting the former. . . . Indeed, it is ob- 
vious that invention or discovery, be it in mathematics or anywhere else, 
takes place by combining ideas.


Hadamard suggests that even though discovery is not a result (cannot be a
result) of pure chance, it is almost certainly guided by directed serendipity.
Furthermore, Hadamard hints that a proper role for chance in a more humanlike
discovery mechanism is to cause the juxtaposition of different notions. It is in- 
teresting that genetic algorithms adopt Hadamard's mix of direction and chance 
in a manner that efficiently builds new solutions from the best partial solutions 
of previous trials. 

To see this, consider a population of n strings (perhaps the four-string
population for the black box problem) over some appropriate alphabet, coded so
that each is a complete idea or prescription for performing a particular task (in
this case, each string is one complete switch-setting idea). Substrings within each
string (idea) contain various notions of what is important or relevant to the task.
Viewed in this way, the population contains not just a sample of n ideas; rather, 
it contains a multitude of notions and rankings of those notions for task perfor- 
mance. Genetic algorithms ruthlessly exploit this wealth of information by (1) 
reproducing high-quality notions according to their performance and (2) cross- 
ing these notions with many other high-performance notions from other strings. 
Thus, the action of crossover with previous reproduction speculates on new ideas 
constructed from the high-performance building blocks (notions) of past trials.
In passing, we note that despite the somewhat fuzzy definition of a notion, we
have not limited a notion to simple linear combinations of single features or pairs
of features. Biologists have long recognized that evolution must efficiently pro- 
cess the epistasis (positionwise nonlinearity) that arises in nature. In a similar 
manner, the notion processing of genetic algorithms must effectively process no- 
tions even when they depend upon their component features in highly nonlinear 
and complex ways.

Exchanging of notions to form new ideas is appealing intuitively, if we think
in terms of the process of innovation. What is an innovative idea? As Hadamard 
suggests, most often it is a juxtaposition of things that have worked well in the 
past. In much the same way, reproduction and crossover combine to search po- 
tentially pregnant new ideas. This experience of emphasis and crossing is analo- 
gous to the human interaction many of us have observed at a trade show or 
scientific conference. At a widget conference, for example, various widget ex- 
perts from around the world gather to discuss the latest in widget technology. 
After the lecture sessions, they all pair off around the bar to exchange widget 
stories. Well-known widget experts, of course, are in greater demand and ex- 
change more ideas, thoughts, and notions with their lesser known widget col- 
leagues. When the show ends, the widget people return to their widget 
laboratories to try out a surfeit of widget innovations. The process of reproduc- 
tion and crossover in a genetic algorithm is this kind of exchange. High-perfor- 
mance notions are repeatedly tested and exchanged in the search for better and 
better performance.

If reproduction according to fitness combined with crossover gives genetic
algorithms the bulk of their processing power, what then is the purpose of the
mutation operator? Not surprisingly, there is much confusion about the role of
mutation in genetics (both natural and artificial). Perhaps it is the result of too
many B movies detailing the exploits of mutant eggplants that consume mass
quantities of Tokyo or Chicago, but whatever the cause for the confusion, we find
that mutation plays a decidedly secondary role in the operation of genetic
algorithms. Mutation is needed because, even though reproduction and crossover
effectively search and recombine extant notions, occasionally they may become
overzealous and lose some potentially useful genetic material (1's or 0's at
particular locations). In artificial genetic systems, the mutation operator protects
against such an irrecoverable loss. In the simple GA, mutation is the occasional
(with small probability) random alteration of the value of a string position. In the
binary coding of the black box problem, this simply means changing a 1 to a 0
and vice versa. By itself, mutation is a random walk through the string space.
When used sparingly with reproduction and crossover, it is an insurance policy
against premature loss of important notions.
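Bit-by-bit mutation in the binary coding can be sketched as follows. The function name `mutate` and the parameter name `p_m` are assumptions for this illustration; the two extreme probabilities are used only to make the behavior easy to check.

```python
import random


def mutate(string, p_m, rng):
    """Flip each bit independently with probability p_m."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[c] if rng.random() < p_m else c for c in string)


rng = random.Random(0)
unchanged = mutate("01101", 0.0, rng)   # no bit can flip
inverted = mutate("01101", 1.0, rng)    # every bit flips
print(unchanged, inverted)
```

With a small probability such as 0.001, almost every call returns its input unchanged, which is consistent with the secondary role the text assigns to mutation.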

To see that the mutation operator plays a secondary role in the simple GA, we
simply note that the frequency of mutation to obtain good results in empirical
genetic algorithm studies is on the order of one mutation per thousand bit
(position) transfers. Mutation rates are similarly small (or smaller) in natural
populations, leading us to conclude that mutation is appropriately considered as a
secondary mechanism of genetic algorithm adaptation.

Other genetic operators and reproductive plans have been abstracted from
the study of biological examples. However, the three examined in this section,
reproduction, simple crossover, and mutation, have proved to be both computa- 
tionally simple and effective in attacking a number of important optimization 
problems. In the next section, we perform a hand simulation of the simple genetic 
algorithm to demonstrate both its mechanics and its power.


GENETIC ALGORITHMS AT WORK: A SIMULATION BY HAND


Let’s apply our simple genetic algorithm to a particular optimization problem step 
by step. Consider the problem of maximizing the function $f(x) = x^2$, where x is
permitted to vary between 0 and 31, a function displayed earlier as Fig. 1.5. To 
use a genetic algorithm we must first code the decision variables of our problem 
as some finite-length string. For this problem, we will code the variable x simply 
as a binary unsigned integer of length 5. Before we proceed with the simulation, 
let's briefly review the notion of a binary integer. As decadigited creatures, we 
have little problem handling base 10 integers and arithmetic. For example, the 
five-digit number 53,095 may be thought of as


$5 \cdot 10^4 + 3 \cdot 10^3 + 0 \cdot 10^2 + 9 \cdot 10^1 + 5 \cdot 10^0 = 53{,}095.$


In base 2 arithmetic, we of course only have two digits to work with, 0 and 1, 
and as an example the number 10011 decodes to the base 10 number


$1 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 16 + 2 + 1 = 19.$


With a five-bit (binary digit) unsigned integer we can obtain numbers between
0 (00000) and 31 (11111). With a well-defined objective function and coding,
we now simulate a single generation of a genetic algorithm with reproduction, 
crossover, and mutation. 

To start off, we select an initial population at random. We select a population 
of size 4 by tossing a fair coin 20 times. We can skip this step by using the initial
population created in this way earlier for the black box switching problem. Look- 
ing at this population, shown on the left-hand side of Table 1.2, we observe that 
the decoded x values are presented along with the fitness or objective function 
values f(x). To make sure we know how the fitness values f(x) are calculated 
from the string representation, let’s take a look at the third string of the initial 
population, string 01000. Decoding this string as an unsigned binary integer, we 
note that there is a single one in the $2^3 = 8$'s position. Hence for string 01000
we obtain x = 8. To calculate the fitness or objective function we simply square
the x value and obtain the resulting fitness value f(x) = 64. Other x and f(x)
values may be obtained similarly. 
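The decoding and fitness calculation can be checked mechanically. In this short sketch the names `decode` and `fitness` are assumptions; Python's built-in `int(s, 2)` performs the unsigned binary decoding described above.

```python
def decode(string):
    """Interpret a bit string as an unsigned binary integer."""
    return int(string, 2)


def fitness(string):
    """Objective function f(x) = x**2 applied to the decoded value."""
    return decode(string) ** 2


population = ["01101", "11000", "01000", "10011"]
x_values = [decode(s) for s in population]
f_values = [fitness(s) for s in population]
print(x_values, f_values)
```

The resulting fitness values are the same ones that appear in Table 1.1 and on the left-hand side of Table 1.2.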

You may notice that the fitness or objective function values are the same as 
the black box values (compare Tables 1.1 and 1.2). This is no coincidence, and 
the black box optimization problem was well represented by the particular func- 
tion, f(x), and coding we are now using. Of course, the genetic algorithm need 
not know any of this; it is just as happy to optimize some arbitrary switching 
function (or any other finite coding and function for that matter) as some poly- 
nomial function with straightforward binary coding. This discussion simply rein- 
forces one of the strengths of the genetic algorithm: by exploiting similarities in 
codings, genetic algorithms can deal effectively with a broader class of functions 
than can many other procedures. 

TABLE 1.2 A Genetic Algorithm by Hand

String    Initial Population      x Value               f(x)     pselect_i    Expected Count    Actual Count
No.       (Randomly Generated)    (Unsigned Integer)    = x^2    = f_i/Σf     = f_i/f_avg       (Roulette Wheel)
1         01101                   13                     169     0.14         0.58              1
2         11000                   24                     576     0.49         1.97              2
3         01000                    8                      64     0.06         0.22              0
4         10011                   19                     361     0.31         1.23              1
Sum                                                     1170     1.00         4.00              4.0
Average                                                  293     0.25         1.00              1.0
Max                                                      576     0.49         1.97              2.0

A generation of the genetic algorithm begins with reproduction. We select
the mating pool of the next generation by spinning the weighted roulette wheel
(shown in Fig. 1.7) four times. Actual simulation of this process using coin tosses
has resulted in string 1 and string 4 receiving one copy in the mating pool, string 
2 receiving two copies, and string 3 receiving no copies, as shown in the center 
of Table 1.2. Comparing this with the expected number of copies (n · pselect_i) we
have obtained what we should expect: the best get more copies, the average stay 
even, and the worst die off. 
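The selection probabilities and expected counts can be recomputed directly from the fitness values. This is a small worked check under the assumption that the expected count of a string is n times its selection probability, equivalently its fitness divided by the population average; the variable names are illustrative.

```python
fitnesses = [169, 576, 64, 361]
n = len(fitnesses)

total = sum(fitnesses)
average = total / n

# Selection probability of each string on the weighted wheel.
p_select = [f / total for f in fitnesses]

# Expected number of copies in the mating pool: n * p_select = f / average.
expected = [n * p for p in p_select]

for f, p, e in zip(fitnesses, p_select, expected):
    print(f"{f:5d}  {p:.4f}  {e:.2f}")
```

The expected counts sum to n, so on average the mating pool is exactly as large as the population it replaces.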

With an active pool of strings looking for mates, simple crossover proceeds 
in two steps: (1) strings are mated randomly, using coin tosses to pair off the 
happy couples, and (2) mated string couples cross over, using coin tosses to 
select the crossing sites. Referring again to Table 1.2, random choice of mates has 
selected the second string in the mating pool to be mated with the first. With a 
crossing site of 4, the two strings 01101 and 11000 cross and yield two new 
strings 01100 and 11001. The remaining two strings in the mating pool are 
crossed at site 2; the resulting strings may be checked in the table.

The last operator, mutation, is performed on a bit-by-bit basis. We assume 
that the probability of mutation in this test is 0.001. With 20 transferred bit
positions we should expect 20 · 0.001 = 0.02 bits to undergo mutation during a
given generation. Simulation of this process indicates that no bits undergo
mutation for this probability value. As a result, no bit positions are changed from 0
to 1 or vice versa during this generation. 

Following reproduction, crossover, and mutation, the new population is 
ready to be tested. To do this, we simply decode the new strings created by the 
simple genetic algorithm and calculate the fitness function values from the x 
values thus decoded. The results of a single generation of the simulation are 
shown at the right of Table 1.2. While drawing concrete conclusions from a single 
trial of a stochastic process is, at best, a risky business, we start to see how genetic 
algorithms combine high-performance notions to achieve better performance. In 
the table, note how both the maximal and average performance have improved 
in the new population. The population average fitness has improved from 293 to
439 in one generation. The maximum fitness has increased from 576 to 729 during
that same period. Although random processes help cause these happy circumstances,
we start to see that this improvement is no fluke. The best string of the
first generation (11000) receives two copies because of its high, above-average
performance. When this combines at random with the next highest string
(10011) and is crossed at location 2 (again at random), one of the resulting
strings (11011) proves to be a very good choice indeed.

TABLE 1.2 (Continued)

Mating Pool after     Mate          Crossover Site
Reproduction          (Randomly     (Randomly         New           x        f(x)
(Cross Site Shown)    Selected)     Selected)         Population    Value    = x^2
0110|1                2             4                 01100         12        144
1100|0                1             4                 11001         25        625
11|000                4             2                 11011         27        729
10|011                3             2                 10000         16        256
Sum                                                                          1754
Average                                                                       439
Max                                                                           729

NOTES:

1) Initial population chosen by four repetitions of five coin tosses where heads = 1, tails = 0.

2) Reproduction performed through 1 part in 8 simulation of roulette wheel selection (three
coin tosses).

3) Crossover performed through binary decoding of 2 coin tosses (TT = 00_2 = 0 = cross site
1, HH = 11_2 = 3 = cross site 4).

4) Crossover probability assumed to be unity, p_c = 1.0.

5) Mutation probability assumed to be 0.001, p_m = 0.001. Expected mutations = 5 · 4 · 0.001 =
0.02. No mutations expected during a single generation. None simulated.
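The hand simulation summarized in Table 1.2 can be replayed deterministically. The sketch below hard-codes the mating pool, the pairings, and the cross sites reported in the table instead of drawing them at random, so it only verifies the arithmetic; the helper names are assumptions for this example.

```python
def fitness(string):
    """f(x) = x**2 for the unsigned integer coded by the bit string."""
    return int(string, 2) ** 2


def crossover(a, b, k):
    """Exchange everything after position k (1-based) between a and b."""
    return a[:k] + b[k:], b[:k] + a[k:]


# Mating pool after reproduction: string 2 received two copies, string 3 none.
mating_pool = ["01101", "11000", "11000", "10011"]

# (index of first mate, index of second mate, cross site), as in Table 1.2.
pairings = [(0, 1, 4), (2, 3, 2)]

new_population = []
for i, j, site in pairings:
    new_population.extend(crossover(mating_pool[i], mating_pool[j], site))

new_fitness = [fitness(s) for s in new_population]
print(new_population, sum(new_fitness), sum(new_fitness) / 4, max(new_fitness))
```

No mutation step is included because none occurred in the hand simulation; adding one with probability 0.001 would rarely change the outcome for only 20 bits.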

This event is an excellent illustration of the ideas and notions analogy
developed in the previous section. In this case, the resulting good idea is the
combination of two above-average notions, namely the substrings 11--- and ---11.
Although the argument is still somewhat heuristic, we start to see how genetic
algorithms effect a robust search. In the next section, we expand our
understanding of these concepts by analyzing genetic algorithms in terms of schemata
or similarity templates.

The intuitive viewpoint developed thus far has much appeal. We have com- 
pared the genetic algorithm with certain human search processes commonly 
called innovative or creative. Furthermore, hand simulation of the simple genetic 
algorithm has given us some confidence that indeed something interesting is 
going on here. Yet, something is missing. What is being processed by genetic 
algorithms and how do we know whether processing it (whatever it is) will lead 
to optimal or near optimal results in a particular problem? Clearly, as scientists,
engineers, and business managers we need to understand the what and the how
of genetic algorithm performance. 

To obtain this understanding, we examine the raw data available for any 
search procedure and discover that we can search more effectively if we exploit 
important similarities in the coding we use. This leads us to develop the
important notion of a similarity template, or schema. This in turn leads us to a
keystone of the genetic algorithm approach, the building block hypothesis.


GRIST FOR THE SEARCH MILL—IMPORTANT SIMILARITIES 


For much too long we have ignored a fundamental question. In a search process 
given only payoff data (fitness values), what information is contained in a popu- 
lation of strings and their objective function values to help guide a directed 
search for improvement? To ask this question more clearly, consider the strings
and fitness values originally displayed in Table 1.1 from the simulation of the 
previous section (the black box problem) and gathered below for convenience: 


String    Fitness
01101       169
11000       576
01000        64
10011       361





What information is contained in this population to guide a directed search for 
improvement? On the face of it, there is not very much: four independent samples
of different strings with their fitness values. As we stare at the page, however,
quite naturally we start scanning up and down the string column, and we notice
certain similarities among the strings. Exploring these similarities in more depth,
we notice that certain string patterns seem highly associated with good
performance. The longer we stare at the strings and their fitness values, the
greater is the temptation to experiment with these high fitness associations. It
seems perfectly reasonable to play mix and match with some of the substrings that
are highly correlated with past success. For example, in the sample population, the
strings starting with a 1 seem to be among the best. Might this be an important
ingredient in optimizing this function? Certainly with our function ($f(x) = x^2$)
and our coding (a five-bit unsigned integer) we know it is (why is this true?).
But, what are we doing here? Really, two separate things. First, we are seeking 
similarities among strings in the population. Second, we are looking for causal 
relationships between these similarities and high fitness. In so doing, we admit a 
wealth of new information to help guide a search. To see how much and precisely
what information we admit, let us consider the important concept of a schema
(plural, schemata), or similarity template. 


SIMILARITY TEMPLATES (SCHEMATA) 


In some sense we are no longer interested in strings as strings alone. Since im- 
portant similarities among highly fit strings can help guide a search, we question 
how one string can be similar to its fellow strings. Specifically we ask, in what 
ways is a string a representative of other string classes with similarities at certain 
string positions? The framework of schemata provides the tool to answer these 
questions.

A schema (Holland, 1968, 1975) is a similarity template describing a subset 
of strings with similarities at certain string positions. For this discussion, let us 
once again limit ourselves without loss of generality to the binary alphabet {0,1}. 
We motivate a schema most easily by appending a special symbol to this alphabet; 
we add the * or don't care symbol. With this extended alphabet we can now 
create strings (schemata) over the ternary alphabet {0, 1, *}, and the meaning of
the schema is clear if we think of it as a pattern matching device: a schema
matches a particular string if at every location in the schema a 1 matches a 1 in
the string, a 0 matches a 0, or a * matches either. As an example, consider the
strings and schemata of length 5. The schema *0000 matches two strings, namely
{10000, 00000}. As another example, the schema *111* describes a subset with
four members {01110, 01111, 11110, 11111}. As one last example, the schema
0*1** matches any of the eight strings of length 5 that begin with a 0 and have a 
1 in the third position. As you can start to see, the idea of a schema gives us a 
powerful and compact way to talk about all the well-defined similarities among 
finite-length strings over a finite alphabet. We should emphasize that the * is only 
a metasymbol (a symbol about other symbols); it is never explicitly processed 
by the genetic algorithm. It is simply a notational device that allows description 
of all possible similarities among strings of a particular length and alphabet. 
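The pattern-matching reading of a schema translates directly into code. In this sketch the function names `matches` and `members` are assumptions; `members` enumerates every binary string of the right length and keeps those the schema matches, which is practical only for short strings.

```python
from itertools import product


def matches(schema, string):
    """True if every schema position is '*' or equals the string's character."""
    return len(schema) == len(string) and all(
        s == "*" or s == c for s, c in zip(schema, string)
    )


def members(schema):
    """List the binary strings matched by a schema (brute-force enumeration)."""
    candidates = ("".join(bits) for bits in product("01", repeat=len(schema)))
    return [c for c in candidates if matches(schema, c)]


print(members("*0000"))
print(members("*111*"))
print(len(members("0*1**")))
```

Note that the * symbol appears only in the schema argument, never in the strings themselves, in keeping with its role as a metasymbol.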

Counting the total number of possible schemata is an enlightening exercise. 
In the previous example, with l = 5, we note there are $3 \cdot 3 \cdot 3 \cdot 3 \cdot 3 = 3^5 = 243$
different similarity templates because each of the five positions may be a 0, 1,
or *. In general, for alphabets of cardinality (number of alphabet characters) k,
there are $(k + 1)^l$ schemata. At first blush, it appears that schemata are making
the search more difficult. For an alphabet with k elements there are only (only?)
$k^l$ different strings of length l. Why consider the $(k + 1)^l$ schemata and enlarge
the space of concern? Put another way, the length 5 example now has only $2^5 = 32$
different alternative strings. Why make matters more difficult by considering
$3^5 = 243$ schemata? In fact, the reasoning discussed in the previous section makes
things easier. Do you recall glancing up and down the list of four strings and 
fitness values and trying to figure out what to do next? We recognized that if we 
considered the strings separately, then we only had four pieces of information;
however, when we considered the strings, their fitness values, and the similarities
among the strings in the population, we admitted a wealth of new information to 
help direct our search. How much information do we admit by considering the 
similarities? The answer to this question is related to the number of unique sche- 
mata contained in the population. To count this quantity exactly requires knowl- 
edge of the strings in a particular population. To get a bound on the number of 
schemata in a particular population, we first count the number of schemata con- 
tained in an individual string, and then we get an upper bound on the total num- 
ber of schemata in the population. 

To see this, consider a single string of length 5: 11111, for example. This 
string is a member of $2^5$ schemata because each position may take on its actual
value or a don't care symbol. In general, a particular string contains $2^l$ schemata.
As a result, a population of size n contains somewhere between $2^l$ and $n \cdot 2^l$
schemata, depending upon the population diversity. This fact verifies our earlier
intuition. The original motivation for considering important similarities was to get
more information to help guide our search. The counting argument shows that a 
wealth of information about important similarities is indeed contained in even 
moderately sized populations. We will examine how genetic algorithms effec- 
tively exploit this information. At this juncture, some parallel processing appears 
to be needed if we are to make use of all this information in a timely fashion. 
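The counting argument can be checked by brute force for the four-string sample population. This sketch is illustrative only: the function name `schemata_of` is an assumption, and the enumeration is feasible here because the strings are just five bits long.

```python
from itertools import product


def schemata_of(string):
    """All schemata a string belongs to: each position keeps its value or becomes '*'."""
    options = [(c, "*") for c in string]
    return {"".join(choice) for choice in product(*options)}


population = ["01101", "11000", "01000", "10011"]
l = len(population[0])
n = len(population)

per_string = len(schemata_of("11111"))                       # 2**l for one string
in_population = set().union(*(schemata_of(s) for s in population))

print(per_string, len(in_population), 2 ** l, n * 2 ** l, 3 ** l)
```

The number of distinct schemata in the population falls between the two bounds because strings that share bit values at some positions also share the schemata built from those positions.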

These counting arguments are well and good, but where does this all lead? 
More pointedly, of the $2^l$ to $n \cdot 2^l$ schemata contained in a population, how many
are actually processed in a useful manner by the genetic algorithm? To obtain the 
answer to this question, we consider the effect of reproduction, crossover, and 
mutation on the growth or decay of important schemata from generation to gen- 
eration. The effect of reproduction on a particular schema is easy to determine; 
since more highly fit strings have higher probabilities of selection, on average we 
give an ever increasing number of samples to the observed best similarity pat- 
terns (this is a good thing to do, as is shown in the next chapter); however, 
reproduction alone samples no new points in the space. What then happens to a 
particular schema when crossover is introduced? Crossover leaves a schema
unscathed if it does not cut the schema, but it may disrupt a schema when it does.
For example, consider the two schemata 1***0 and **11*. The first is likely to be 
disrupted by crossover, whereas the second is relatively unlikely to be destroyed. 
As a result, schemata of short defining length are left alone by crossover and 
reproduced at a good sampling rate by the reproduction operator. Mutation at
normal, low rates does not disrupt a particular schema very frequently and we are
left with a startling conclusion. Highly fit, short-defining-length schemata (we call 
them building blocks) are propagated generation to generation by giving expo- 
nentially increasing samples to the observed best; all this goes in parallel with no 
special bookkeeping or special memory other than our population of n strings.
In the next chapter we will count how many schemata are processed usefully in
each generation. It turns out that the number is something like $n^3$. This compares
favorably with the number of function evaluations (n). Because this processing
leverage is so important (and apparently unique to genetic algorithms), we give
it a special name, implicit parallelism.
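The contrast between the schemata 1***0 and **11* can be made concrete with a rough calculation. The sketch below assumes a definition not spelled out in this section: the defining length of a schema is taken to be the distance between its first and last fixed positions, and the chance that a uniformly chosen cross site cuts the schema is that distance divided by l - 1. A cut does not always destroy the schema, so the value is best read as an upper bound on disruption. The function names are illustrative.

```python
def defining_length(schema):
    """Distance between the outermost fixed (non-'*') positions (assumed definition)."""
    fixed = [i for i, c in enumerate(schema) if c != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


def cut_probability(schema):
    """Chance that a uniformly chosen cross site falls inside the schema's span."""
    return defining_length(schema) / (len(schema) - 1)


for schema in ("1***0", "**11*"):
    print(schema, defining_length(schema), cut_probability(schema))
```

Under these assumptions every cross site cuts 1***0, while only one of the four possible sites cuts **11*, which matches the qualitative claim in the text.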


LEARNING THE LINGO


The power behind the simple operations of our genetic algorithm is at least in- 
tuitively clearer if we think of building blocks. Some questions remain: How do 
we know that building blocks lead to improvement? Why is it a near optimal 
strategy to give exponentially increasing samples to the best? How can we cal- 
culate the number of schemata usefully processed by the genetic algorithm? 
These questions are answered fully in the next chapter, but first we need to mas- 
ter the terminology used by researchers who work with genetic algorithms. Be- 
cause genetic algorithms are rooted in both natural genetics and computer 
science, the terminology used in the GA literature is an unholy mix of the natural 
and the artificial. Until now we have focused on the artificial side of the genetic 
algorithm’s ancestry and talked about strings, alphabets, string positions, and the 
like. We review the correspondence between these terms and their natural coun- 
terparts to connect with the growing GA literature and also to permit our own 
occasional slip of a natural utterance or two.

Roughly speaking, the strings of artificial genetic systems are analogous to 
chromosomes in biological systems. In natural systems, one or more chromo- 
somes combine to form the total genetic prescription for the construction and 
operation of some organism. In natural systems the total genetic package is called 
the genotype. In artificial genetic systems the total package of strings is called a 
structure (in the early chapters of this book, the structure will consist of a single 
string, so the text refers to strings and structures interchangeably until it is nec- 
essary to differentiate between them). In natural systems, the organism formed 
by the interaction of the total genetic package with its environment is called the 
phenotype. In artificial genetic systems, the structures decode to form a partic- 
ular parameter set, solution alternative, or point (in the solution space). The 
designer of an artificial genetic system has a variety of alternatives for coding 
both numeric and nonnumeric parameters. We will confront codings and coding 
principles in later chapters; for now, we stick to our consideration of GA and 
natural terminology. 

In natural terminology, we say that chromosomes are composed of genes, 
which may take on some number of values called alleles. In genetics, the position 
of a gene (its locus) is identified separately from the gene's function. Thus, we 
can talk of a particular gene, for example an animal's eye color gene, its locus, 
position 10, and its allele value, blue eyes. In artificial genetic search we say that 
strings are composed of features or detectors, which take on different values. 
Features may be located at different positions on the string. The correspondence 
between natural and artificial terminology is summarized in Table 1.3. 

Thus far, we have not distinguished between a gene (a particular character) 
and its locus (its position); the position of a bit in a string has determined its 
meaning (how it decodes) uniformly throughout a population and throughout 
time. For example, the string 10000 is decoded as a binary unsigned integer 16 
(base 10) because implicitly the 1 is in the 16's place. It is not necessary to limit 
codings like this, however. A later chapter presents more advanced structures 
that treat locus and gene separately. 
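
The positional decoding just described can be sketched in a few lines of Python. This is only an illustration: the function name decode_unsigned is our own assumption and does not appear in the text.

```python
def decode_unsigned(bits: str) -> int:
    """Decode a string of '0'/'1' characters as an unsigned binary integer."""
    value = 0
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a binary character: {bit!r}")
        # Doubling the running value shifts every earlier bit one place left,
        # so the leftmost bit of a five-bit string lands in the 16's place.
        value = value * 2 + int(bit)
    return value


print(decode_unsigned("10000"))  # 16
```

Here the meaning of each bit is fixed entirely by its position, which is the implicit coupling of gene and locus noted above.
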
TABLE 1.3 Comparison of Natural and GA Terminology 

Natural         Genetic Algorithm 

chromosome      string 
gene            feature, character, or detector 
allele          feature value 
locus           string position 
genotype        structure 
phenotype       parameter set, alternative solution, a decoded structure 
epistasis       nonlinearity 

SUMMARY 


This chapter has laid the foundation for understanding genetic algorithms, 
their mechanics and their power. We are led to these methods by our search for 
robustness; natural systems are robust—efficient and efficacious—as they adapt 
to a wide variety of environments. By abstracting nature's adaptation algorithm 
of choice in artificial form we hope to achieve similar breadth of performance. 
In fact, genetic algorithms have demonstrated their capability in a number of 
analytical and empirical studies. 

The chapter has presented the detailed mechanics of a simple, three-operator 
genetic algorithm. Genetic algorithms operate on populations of strings, with the 
string coded to represent some underlying parameter set. Reproduction, cross- 
over, and mutation are applied to successive string populations to create new 
string populations. These operators are simplicity itself, involving nothing more 
complex than random number generation, string copying, and partial string 
exchanging; yet, despite their simplicity, the resulting search performance is 
wide-ranging and impressive. Genetic algorithms realize an innovative notion exchange 
among strings and thus connect to our own ideas of human search or discovery. 
A simulation of one generation of the simple genetic algorithm has helped illus- 
trate both the detail and the power of the method. 
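
As a rough sketch of how little machinery the three operators need, the following Python fragment performs one generation of reproduction, crossover, and mutation. It is not the program developed later in the book. The payoff function f(x) = x**2, the starting population, the parameter values pc and pm, and all function names are assumptions made for illustration.

```python
import random


def fitness(bits):
    """Assumed payoff for illustration: square of the decoded integer."""
    return int(bits, 2) ** 2


def select(population, rng):
    """Reproduction: roulette-wheel choice, probability proportional to fitness."""
    weights = [fitness(s) for s in population]
    if sum(weights) == 0:  # degenerate case: every string scores zero
        return rng.choice(population)
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(a, b, rng):
    """Single-point crossover: exchange the tails after a random cut site."""
    site = rng.randint(1, len(a) - 1)
    return a[:site] + b[site:], b[:site] + a[site:]


def mutate(bits, pm, rng):
    """Flip each bit independently with (small) probability pm."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < pm else b for b in bits
    )


def next_generation(population, pc=0.6, pm=0.001, rng=random):
    """Build a new population of the same size from the old one."""
    new_pop = []
    while len(new_pop) < len(population):
        a, b = select(population, rng), select(population, rng)
        if rng.random() < pc:
            a, b = crossover(a, b, rng)
        new_pop.extend([mutate(a, pm, rng), mutate(b, pm, rng)])
    return new_pop[: len(population)]


population = ["01101", "11000", "01000", "10011"]  # illustrative strings
print(next_generation(population, rng=random.Random(0)))
```

Nothing beyond random number generation, string copying, and partial string exchange is involved, which is the point made in the paragraph above.
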


Four differences separate genetic algorithms from more conventional opti- 
mization techniques: 


1. Direct manipulation of a coding 

2. Search from a population, not a single point 

3. Search via sampling, a blind search 

4. Search using stochastic operators, not deterministic rules 


Genetic algorithms manipulate decision or control variable representations 
at the string level to exploit similarities among high-performance strings. Other 
methods usually deal with functions and their control variables directly. Because 
genetic algorithms operate at the coding level, they are difficult to fool even when 
the function may be difficult for traditional schemes. 

Genetic algorithms work from a population; many other methods work from 
a single point. In this way, GAs find safety in numbers. By maintaining a population 
of well-adapted sample points, the probability of reaching a false peak is reduced. 

Genetic algorithms achieve much of their breadth by ignoring information 
except that concerning payoff. Other methods rely heavily on such information, 
and in problems where the necessary information is not available or difficult to 
obtain, these other techniques break down. GAs remain general by exploiting 
information available in any search problem. Genetic algorithms process similar- 
ities in the underlying coding together with information ranking the structures 
according to their survival capability in the current environment. By exploiting 
such widely available information, GAs may be applied in virtually any problem. 

The transition rules of genetic algorithms are stochastic; many other methods 
have deterministic transition rules. A distinction exists, however, between the 
randomized operators of genetic algorithms and other methods that are simple 
random walks. Genetic algorithms use random choice to guide a highly exploi- 
tative search. This may seem unusual, using chance to achieve directed results 
(the best points), but nature is full of precedent. 

We have started a more rigorous appraisal of genetic algorithm performance 
through the concept of schemata or similarity templates. A schema is a string 
over an extended alphabet, {0,1,*} where the 0 and the 1 retain their normal 
meaning and the * is a wild card or don't care symbol. This notational device 
greatly simplifies the analysis of the genetic algorithm method because it explic- 
itly recognizes all the possible similarities in a population of strings. We have 
discussed how building blocks—short, high-performance schemata—are com- 
bined to form strings with expected higher performance. This occurs because 
building blocks are sampled at near optimal rates and recombined via crossover. 
Mutation has little effect on these building blocks; like an insurance policy, it 
helps prevent the irrecoverable loss of potentially important genetic material. 
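
The schema notation can be made concrete with a short Python sketch. The helper names matches and schemata_of are our own assumptions. The sketch checks whether a string is an instance of a schema over {0,1,*} and lists every schema a given string belongs to.

```python
from itertools import product


def matches(schema: str, string: str) -> bool:
    """True if the string is an instance of the schema ('*' matches 0 or 1)."""
    return len(schema) == len(string) and all(
        s == "*" or s == c for s, c in zip(schema, string)
    )


def schemata_of(string: str):
    """Every schema that the given string is an instance of."""
    # At each position either keep the actual bit or replace it with '*'.
    return ["".join(choice) for choice in product(*[(c, "*") for c in string])]


print(matches("*0000", "10000"))   # True
print(matches("*111*", "10000"))   # False
print(len(schemata_of("10000")))   # 32
```

Each position of a string is either kept or replaced by the wild card, so a string of length 5 is an instance of 2**5 = 32 schemata, while the extended alphabet allows 3**5 = 243 schemata in all.
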

The simple genetic algorithm studied in this chapter has much to recom- 
mend it. In the next chapter, we will analyze its operation more carefully. Follow- 
ing this, we will implement the simple GA in a short computer program and 
examine some applications in practical problems. 


PROBLEMS 


1.1. Consider a black box containing eight multiple-position switches. Switches 
1 and 2 may be set in any of 16 positions. Switches 3, 4, and 5 are four-position 
switches, and switches 6-8 have only two positions. Calculate the number of 
unique switch settings possible for this black box device. 


1.2. For the black box device of Problem 1.1, design a natural string coding that 
uses eight positions, one position for each switch. Count the number of switch 

With its growing ease of use and burgeoning popularity, the Internet is
fast becoming the all-purpose information superhighway we've been hearing
so many promises about. But can it survive the transition from a
government protectorate to a free-market medium?

AS WITH ANY ROMANCE, the world's present infatuation with the Internet
has emphasized the magic and tended to ignore the practical. Couples
falling in love dote on each other's wonder traits. It is only a little
later that they float down from the clouds and tackle the economics—how
big a house they can afford, what kind of income they will need.

The Internet, for most people, is still a new love. Ignore for a moment
the few tens of thousands of people who inhabited the Net before the
phenomenal population boom began in the late 1980s. For the rest of us,
the Internet is still exciting and not a little bit mysterious.

One oddity is that nobody seems to be paying for all the informational
goodies that can pour into our computers like water from a broken pipe.
You might pay a few dollars a month for the privilege of being connected,
but once you slide past the electronic turnstile, it’s an
all-you-can-consume buffet of bits, from plain vanilla e-mail to
participation in online discussion groups to queries of Library of
Congress databases to the latest satellite images of the planet. Want to
send a 10-page letter to a friend in Australia? Go ahead—no extra charge.
Want a nifty piece of software that lets you browse the Net by pointing
and clicking? Take it, it’s free. Want to mail a fund-raising appeal to
10,000 people? The Internet converts this from a $3,200 postal endeavor
into one that’s more or less on the house. Internet users seem to have
found a kind of surreal restaurant where they can order a bottomless cup
of coffee or a lobster dinner for 100 friends and no one ever presents an
itemized bill.

Part of the reason is that, at least until the last few 
years, most members of the Internet Nation plugged in 
through computers at their workplace or university, so 
costs were a kind of invisible overhead that someone else 
worried about. At MIT, for example, a high-speed fiber- 
optic network called MITnet links com- 
puters all over campus. One of these 
computers serves as a “gateway” that 
connects MITnet to one of 17 regional 
networks, in this case called Nearnet and 
operated by Bolt Beranek and Newman, 
a Cambridge-based technology consult- 
ing company. Nearnet, in turn, is con- 
nected to a “backbone” known as 
NSFNet because it is funded by the 
National Science Foundation. NSFNet 
itself is operated by ANS, a not-for-profit 
company that has leased high-capacity 
fiber-optic telephone lines from the same 
companies that handle long-distance 
telephone traffic—AT&T, MCI, and 
Sprint. Each organization pays a flat rate 
to the broader system it taps into; indi- 
vidual users are essentially insulated 
from cost burdens regardless of the volume 
of their use. 

With similar hierarchical connec- 
tions, commercial on-line services such 
as Prodigy and America Online give indi- 
vidual subscribers e-mail privileges on 
the Internet as well as access to some of 
its more popular resources, such as the 
Usenet newsgroups (online bulletin 
boards on hundreds of topics). These 
commercial services are heavily promot- 
ing such connections, particularly the 
ability of subscribers to tap into the 
World Wide Web, an interwoven collec- 
tion of Internet resources that allows 
point-and-click navigation without mas- 
tery of arcane commands. But although 
this access brings entrance into an elec- 
tronic universe where interactivity is not 
just a marketing slogan but a way of life, 
the cost is usually just a fraction of what users pay for 
cable television. 

That situation may change as the Internet detaches from the government
umbilical cord that has nurtured it through its infancy. Beginning April
30 of this year, the NSF will no longer pay to operate the backbone
network. The portion of NSF funding that goes to the 17 regionals is also
now on a five-year “sunset schedule,” dropping gradually to zero by
fiscal 1998.

Millions of people enjoy low-cost access to the Internet’s information
cornucopia. Now, without the federal subsidies that have built and
sustained it, the Net will have to make it on its own—forcing decisions
on whether, and how much, users might pay.

Under the new arrangement, the federal govern- 
ment will grant this dwindling amount of money 
directly to each regional network and instruct it to shop 
for backbone service on the private market. The transi- 
tion resembles, in one sense, the breakup of the Bell Sys- 
tem a decade ago. But the govern- 
ment is not stepping out of the 
Internet picture altogether. NSF is 
setting up and funding three “net- 
work access points," or NAPs; any 
company that wants to operate an 
Internet backbone must connect to 
each of these three NAPs, which are 
to be located in New Jersey, 
Chicago, and California. To qualify 
as a backbone service provider, a 
company must agree to accept 
Internet transmissions that arrive at 
each NAP from every other back- 
bone company. 

The withdrawal of a large part 
of government support will not by 
itself significantly raise prices for 
users. NSF's total funding for the 
Internet is only about $20 million a 
year. The companies, universities, 
and individuals that use the Net pay 
many times that amount, and divid- 
ing that $20 million over the num- 
ber of present users yields only 
about $1 per person per year. As the 
Net grows in popularity, that bur- 
den may diminish further. What 
worries some analysts, however, is 
that the nature of information being 
sent over the Internet is changing 
rapidly, with potential implications 
for the system's cost and ease of 
access. 


DIGITAL CONGESTION 


Until about two years ago, the overwhelming majority of Net users
were transmitting simple text such as e-mail messages and Usenet postings. 
Text is a highly efficient method of communication: the 
words composing a page of the Encyclopedia Britannica, 
for instance, can be encoded in standard ASCII form 
using fewer than 10 kilobytes. But new software and 
more powerful desktop computers have made it practi- 
cal to send high-resolution color images, sound files, 
even full-motion video—anyone with a camcorder and 
a multimedia computer can conduct a videoconference, 
for example, over the Net. Such uses consume orders 
of magnitude more capacity, or “bandwidth,” than text. 
By swamping the network with video signals, relatively 
few users can temporarily overload portions of the Net. 
The World Wide Web has the potential to exacer- 
bate this problem, since Web browsers can easily, almost 
inadvertently, trigger the transmission of huge amounts 
of data. Wave a mouse 
around the screen, click 
once on an appealing 
picture, and megabytes 
start flowing. Without 
the Web, users must 
type in a command to 
retrieve that informa- 
tion—a step that can at 
least give them pause. 

To understand both the Net’s capacity to transmit such information and
its vulnerability to overload, compare it with the familiar telephone
system. That network uses a technique known as circuit switching: when
you dial your grandmother’s number, you are instructing the system's
switches to establish a connection between your telephone and hers. This
connection is maintained for the duration of the call: as long as you are
on the line, no one else can use this circuit. You are consuming a scarce
resource and pay for the privilege.

Although most Internet traffic physically flows through the telephone
wires, information is packaged and routed much differently. Each
transmission is broken up into discrete “packets” containing roughly 200
bytes (packet size varies). Each packet is stamped with the recipient’s
address. The packets then bounce from computer to computer along the Net,
each computer examining the address and deciding where to send them next
for the most efficient transmission. Since these decisions depend on
conditions at the moment, the packets may travel different routes to
reach the same destination. Eventually all the packets arrive at the
receiving computer, which reassembles them into the original form.
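
A toy Python sketch can make the packet idea concrete. It is not an implementation of any real protocol: the field names, the host name, and the function names are invented for illustration, and the sequence number stamped on each packet is an added assumption (the description above mentions only the address) that lets the receiver restore the original order.

```python
import random


def packetize(message: bytes, dest: str, size: int = 200):
    """Split a message into packets stamped with an address and a sequence number."""
    return [
        {"dest": dest, "seq": i, "payload": message[off:off + size]}
        for i, off in enumerate(range(0, len(message), size))
    ]


def reassemble(packets):
    """Rebuild the original message regardless of the order of arrival."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return b"".join(p["payload"] for p in ordered)


message = bytes(range(256)) * 4              # a 1,024-byte transmission
packets = packetize(message, "host-b")       # hypothetical destination name
random.Random(1).shuffle(packets)            # packets may take different routes
print(len(packets), reassemble(packets) == message)  # 6 True
```

Shuffling stands in for packets traveling different routes; the receiver still recovers the transmission because each packet carries enough information to be put back in place.
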

This structure—or architecture, as computer scientists like to call
it—stems in part from the Internet's origins as a defense project. A
packet-switching system is difficult to eavesdrop on, since messages are
scattered to the electronic winds before finally coalescing at the
receiving point. The design also lowers the risk that a military attack
would disrupt communications—a primary concern in the 1950s and '60s when
ARPAnet, the Internet’s ancestor, was designed by the Pentagon’s Advanced
Research Projects Agency. The reasoning was brutally straightforward: if
an enemy attack were to knock out the Washington-to-New York connection,
say, information would still move between these two cities, albeit in a
roundabout manner. A circuit-switched network like the telephone system
offers only limited flexibility in this regard because a circuit must be
established before communication begins; a packet-switched network can
dynamically “heal” itself in mid-transmission.

Although moti- 
vated initially by secu- 
rity concerns, packet- 
switching technology 
has profound implica- 
tions for the economics 
of network communi- 
cations. When someone sends something over the Inter- 
net—say, a piece of e-mail—the packets do not con- 
sume a scarce resource in the same way that a phone call 
does. If a router is busy, incoming packets simply queue 
up and wait their turn. Longer lines translate into 
delays, not busy signals. For the uses of the Internet 
that have prevailed so far, such lags don't make much 
difference. Unlike telephone conversations, which take 
place in real time, e-mail communications can easily tol- 
erate delays of many seconds or even minutes. However, 
the advent of multimedia services on the Internet is 
making delays less tolerable. If packets queue up at a 
router, quality of service can deteriorate; video appears 
jumpy, for example, and moving from one World Wide 
Web link to another can take so long that the medium 
becomes more an annoyance than an adventure. 

It is possible that advances in technology will provide 
the needed capacity. The best fibers used in the 
long-distance links that carry both voice and data traf- 
fic can accommodate 2.4 gigabits (billion bits) per sec- 
ond. Over the next two years, upgrades to optical trans- 
mitters and receivers will quadruple that data rate. Even 
better, after more than a decade of development, 
telecommunications engineers are perfecting a system 
called wavelength division multiplexing, which enables 
a single fiber to carry multiple channels of information, 
each encoded on laser light of a slightly differing wave- 
length. Such multiplexed systems will be in place by 
1998, yielding a capacity of 40 gigabits per second, predicts 
Vinton Cerf, a senior vice-president at MCI's data 
services division and president of the Internet Society, a 
nonprofit organization that promotes Internet usage 
and standardization. These radical leaps in performance 
are coming as costs of all component technologies— 
fibers, lasers, and electronics—decline. Technology, in 
other words, has the capacity to abolish near-term 
bandwidth scarcity. 

Still, if the recent past is any guide, the demand 
for bandwidth will grow at least as fast as the supply. 
According to University of Michigan economist Hal 
Varian, while NSFnet now operates at only 5 percent 
of capacity, the volume of packet flow is rising by 6 
percent per month. At that rate, average traffic volume 
will reach 20 percent of capacity in only two years. 
During times of peak use, however, the amount of 
information put onto the Net far exceeds this 20 per- 
cent average, and packets that don’t “fit” have to wait 
until a channel is clear. As the Internet is used more 
and more for real-time forms of communication such 
as videoconferencing, such delays become intolerable. 
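
The growth figures quoted from Varian can be checked with a one-line compound-growth calculation. The Python sketch below assumes, for illustration only, that the 6 percent monthly increase stays constant and compounds; the function name is hypothetical.

```python
def utilization_after(months, start=0.05, monthly_growth=0.06):
    """Average utilization after compounding monthly growth from a starting level."""
    return start * (1 + monthly_growth) ** months


print(f"{utilization_after(24):.1%}")  # 20.2%
```

Starting from 5 percent, 24 months of 6 percent growth multiplies traffic roughly fourfold, which matches the 20 percent figure in the text.
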
Congestion on the Internet is already hampering 
attempts to use it for new applications during peak business 
hours, says Jeffrey MacKie-Mason, an economist 
at the University of Michigan. The problem becomes 
particularly acute when some special event occurs. After 
the comet Shoemaker-Levy struck Jupiter, for example, 
and people downloaded the dramatic telescope images, 
large portions of the Internet slowed down. In such sit- 
uations, urgent transmissions, such as a potentially life- 
saving videoconference between a surgeon and a radi- 
ologist, might queue up behind a home movie that 
someone put on the Net just for fun. In effect, the Net 
can be dominated by people with a lot of time on their 
hands, and there is no provision for buying one’s way 
to the front of the line. 


TO CHARGE OR NOT TO CHARGE 


Some analysts therefore contend that the nation 
needs some kind of disincentive to unbridled con- 
sumption of the Net’s capacity. If people have to 
pay for what they do, they will tend to do less, says Pad- 
manabhan Srinagesh, an engineer at Bell Communica- 
tions Laboratories, or Bellcore. Net users, in other 
words, might have to say goodbye to the freedom of a 
flat rate. The Internet would instead be metered, with 
users paying by the message, by the byte, or by the Web 
page, just as they now pay by the kilowatt-hour for elec- 
tricity or by the minute for long-distance phone calls. 
Philip Gross, vice-president for Internet engineering 
at MCI's data services division, agrees that some sort of 
usage-based pricing is “inevitable,” if only so the multiple 
companies handling backbone traffic can count 
packets and settle accounts with one another. But the 
solution to the problem is not as simple as counting 
packets or bytes. Take a typical transaction: the trans- 
fer of a large software file. User A sends a 100-byte 
request to User B, who responds by transmitting a 1- 
megabyte program. A naive billing 
system would charge User B 10,000 
times more than User A, even though 
User A initiated the transaction and 
received all of its benefits. 
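
The imbalance in that example is easy to reproduce. The Python sketch below bills each party for the bytes it sends, which is the naive scheme described above; the function name, the unit price, and the reading of one megabyte as 1,000,000 bytes are assumptions chosen to match the article's round numbers.

```python
def naive_charges(transfers, price_per_byte):
    """Bill each sender for the bytes it transmits (the naive scheme)."""
    bill = {}
    for sender, nbytes in transfers:
        bill[sender] = bill.get(sender, 0) + nbytes * price_per_byte
    return bill


# User A sends a 100-byte request; User B answers with a 1-megabyte program.
transfers = [("A", 100), ("B", 1_000_000)]
bill = naive_charges(transfers, price_per_byte=1)
print(bill["B"] // bill["A"])  # 10000
```

The party who asked for the file and benefits from it pays almost nothing, which is why counting bytes by sender is not enough on its own.
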

Opponents of fees argue that they would rob the Net’s vitality as a
workplace, playground, and social club. The electronic culture that
extols instant and perpetual electronic connectedness presupposes that
people will not have to fret over a ticking fare meter in cyberspace.
Usage pricing might especially dampen the enthusiasm of school and home
computer users. “We should be trying to preserve the motivation of people
to use the Internet,” says David Wasley, director of computer network
services at the University of California at Berkeley. The recent
introduction of usage fees in Australia, Wasley says, led to a rapid
decline in e-mail sent by students.

Not-for-profit organizations that have begun to rely heavily on Internet
mailing lists to reach activists and potential donors are particularly
vulnerable to pricing changes. Any kind of per-use charge will have a
chilling effect on [...] of these [...] exercises in “electronic
democracy,” asserts James Love, president of the Washington-based
Taxpayers’ Assets Project, a group that monitors the outcome of
privatization efforts. “Say you send a message a day to everyone on a
10,000-name list,” he says. “If you have to pay per transaction, that
adds up.”

Other complications arise as well. Part of the Internet’s value—and
charm—lies in its utter transcendence of geography. In today’s system, a
Net surfer who downloads pictures of Jupiter need not care whether the
computer holding these images is 10 miles or 10,000 miles away. All
information stored in any computer on the network is, in effect, stored
everywhere. Because the Net has traditionally rendered geographical
distance a secondary concern, users often don’t even know exactly “where”
they are traveling. “If you try to charge based on distance or on the
number of bits, then the Internet falls apart,” says Edward Krol of the
University of Illinois, author of The Whole Internet User’s Guide &
Catalog. “If the best resource happens to be in Belgium, you just use
it.”

Information sent through the Internet is broken into small “packets” that
can travel along a variety of independent paths to reach the same
destination. This architecture makes the Internet efficient but still
vulnerable to overload.

Fortunately, the present flat-rate pricing does have some built-in
protection against network congestion. As it now stands, the cost of Net
access varies with the bandwidth of the basic connection. A user
operating from a home or office with a 9,600-bit-per-second link pays
less than a business with a 56-kilobit-per-second connection. The size of
this “pipe” limits usage just the way the size of a house’s electrical
service limits the amount of power it can draw off the grid.
Nevertheless, when millions of Net users raise their expectations of what
is possible on the network, “the Internet will melt,” maintains Scott
Shenker, a researcher at Xerox’s Palo Alto Research Center.

One way to avoid any potential crisis is to outfit the Net with
mechanisms that perform electronic triage. Usage that cannot tolerate
delays, such as real-time videoconferencing, would be stamped high
priority and sail through the Net like an ambulance with siren wailing.
Users would pay their Internet provider a premium for this special
treatment. Shenker points out that such priority-based preferences would
lead logically to usage-based pricing, since people [...] real-time
videoconferences if they had to pay dearly for the privilege of
displacing so much other activity. Most users would routinely put a
lowest-priority tag on e-mail and text postings to newsgroups, which
rarely require rapid delivery, and the fees paid by senders who demand
high priority would subsidize any cost associated with such transmission.
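
The triage idea amounts to replacing a router's first-come, first-served line with a priority queue. The Python sketch below is a minimal illustration under assumed conventions: the class name, the numeric priority levels (0 for most urgent), and the sample traffic are invented here and do not describe any existing Internet mechanism.

```python
import heapq
from itertools import count


class TriageQueue:
    """Serve the most urgent packet first; keep arrival order within a class."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker so equal priorities stay FIFO

    def enqueue(self, priority, packet):
        # A lower number means a more urgent packet.
        heapq.heappush(self._heap, (priority, next(self._arrival), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = TriageQueue()
queue.enqueue(2, "e-mail")
queue.enqueue(0, "videoconference frame")
queue.enqueue(2, "newsgroup posting")
print([queue.dequeue() for _ in range(len(queue))])
# ['videoconference frame', 'e-mail', 'newsgroup posting']
```

The videoconference frame arrives second but leaves first, while the two low-priority items keep their original order; a premium fee would be what entitles a sender to the lower number.
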

Although companies that provide Net access have 
stuck mostly to flat-fee structures, they may start pro- 
moting congestion pricing as a marketing edge as the 
number of providers proliferates. Many users will pre- 
sumably appreciate the ability to pay for what amounts 
to a guarantee of immediate transmission. 

One barrier to such a move is that today’s network has no means of
tracking packets sent and received. All Internet communication is
governed by a set of rules, or “protocols,” called TCP/IP (transmission
control protocol/Internet protocol). TCP/IP specifies the method by which
any digital data—whether they represent text, graphics, or anything
else—are transmitted, routed, checked for errors, and, if need be,
resent. But nothing in the protocols provides the detailed information
that commercial telecommunications companies need to provide a billing
record, says MCI's Gross.

Even if the protocols did gather such information, 
the accounting process is bound to be expensive. More 
than half of what customers pay for a telephone call 
goes to cover the cost of the accounting system, con- 
tends Wasley. Thus any attempt to bill for Internet use 
could become a case of self-fulfilling prophecy: the very 
act of collecting the necessary information could raise 
the network’s operating cost to the point that users will 
have to pay more. Anthony Rutkowski, executive direc- 
tor of the Internet Society and a former vice-president 
at Sprint Telecommunications, thinks “it won't be 
worth the trouble to account for users’ consumption of 
the network's capacity.” 

Whether and how to devise a workable billing sys- 
tem if and when usage-based pricing arrives is a decision 
that will have to be made cooperatively by the compa- 
nies that carry Internet traffic, along with the Internet 
Engineering Task Force—an organization with representatives 
from government, research and educational 
institutions, and vendors from all over the world that 
writes technical standards for the Internet. 


A CHANGING PICTURE 


The next few years will be a time of shakeout in 
the burgeoning Internet business. Because the 
Internet operates as a loose cluster of networks, 
no one is really “in charge,” and each provider of local, 
regional, and backbone service is free to price Internet 
access any way it chooses—each, of course, influenced 
by the price charged by its access supplier. 


The most likely short-term scenario is for flat-rate 
service to continue as the norm. In fact, the prevailing 
assumption is that the smartest way to do business will 
be to maintain the status quo. Online society has mush- 
roomed on the basis of unlimited use, a structure that 
encourages a kind of freewheeling exploration. The 
companies that sell Internet access understand this 
appeal and seem loath to tamper with it. “We don’t 
want to kill the goose that laid the golden egg,” says 
MCI's Gross. He insists that MCI “will not unilaterally 
impose” usage-based pricing. "We're very content to 
operate in the current Internet mode” in the near term, he 
says, and expects to make no major changes in the next 
year or two. Eric Aupperle, president of Merit Network, 
the regional network serving Michigan, echoes this sen- 
timent. “My sense is that the community wants flat-fee 
access,” he says, “and that's how it's going to be.” 

One compromise pricing method is for an Internet 
customer to declare at the outset 
whether it will be a high-volume or low- 
volume user. For a given bandwidth, the 
Internet service provider could charge 
the low-volume user less than a high- 
volume user. That way a small com- 
pany can get the benefit of an afford- 
able, high-capacity access. MCI is look- 
ing to put something like this in place 
“in the very short term,” says Gross. 

The amount of change that end users will experience depends on the type
of Internet service they have become accustomed to. Clients of Nearnet in
the Northeast and of Barnet in California will feel barely a ripple
during this transition. Those networks have been accepting commercial
business for years, and so have relied less on NSF. But in other areas of
the country, the regional networks have continued to depend heavily on
NSF for funding. There, prices to universities and other organizations
that hook into the regional network will probably rise.

Usage fees could defray operating costs and possibly relieve congestion
and delays. But the idea of a ticking fare meter is anathema to Net
culture.


INTO THE MARKETPLACE


As federal funding for the Internet winds down, following it into
extinction are the NSF’s “acceptable use policies,” which have restricted
use of the Net for profit-making activity. As the Internet is reborn as a
medium not just of public discourse but of commercial opportunity, the
revenues generated by such activity might well render moot the questions
of per-use charges and ensure access by nonprofit groups. Just as retail
stores pay rent to occupy a shopping mall, companies that make a profit
on the Net will pay telecommunications companies to be there. Microsoft
is heading in this general direction with its plan to offer Internet
access through Windows 95, the long-awaited version of its popular
Windows software for personal computers. Microsoft will obtain its
revenue not from consumers, who will pay the company only a nominal fee
for Internet access, but from businesses that set up shop on the Net.

Commerce on the Internet is still embryonic. A few companies publish
online catalogs, but transactions are still consummated by telephone and
credit card and the product arrives in a UPS truck. But it is in the
marketing of information-based products that the Internet’s business
potential can be more fully tapped. Software.Net, for example, delivers
computer programs through the Net. Other companies are similarly gearing
up to distribute music and digitized art online, according to Bob
O’Keefe, a professor at Rensselaer Polytechnic Institute’s School of
Management. Commercial sponsorship also made possible “free” radio and
television broadcast; following this model, advertising revenue could pay
for on-line publications. Such advertising is potentially of more value
than printed or broadcast ads to consumers, who can obtain precisely the
product information they need with a few mouse clicks on an unobtrusive
icon.

Although such moves may solve some pricing problems, the biggest hurdles
to Internet access for many people are not fees for service but the cost
of a computer. Commercial activity on the Net could help here, too. As a
marketplace, the Internet can be subject to a process as old as the Net
is modern: taxation. One suggestion, from David Farber of the University
of Pennsylvania, is to impose a 10 percent sales tax on business
conducted over the Internet. This revenue could create a pool of funds to
buy Net terminals that would be distributed widely at libraries and
community centers. Given the resistance to new taxes that now dominates
the political landscape, however, such a scheme seems far-fetched.

While equity of access is far from guaranteed in the 
near term, the public should in the long run benefit as 
the Internet is released from the simultaneously nurtur- 
ing and smothering federal sponsorship. Decades of 
government support have built a communications 
infrastructure that fosters experimentation. For the last 
eight or nine years, the NSF has been in the business of 
“market building," says NSFNet program officer David 
Staube—constructing an infrastructure and trying to 
persuade institutions and individuals to use it. That 
phase has passed. Now, he says, “the market can stand 
on its own—without our seed money.”
COMPUTING PRACTICES


Edgar H. Sibley
Panel Editor


An evaluation of a large, operational full-text document-retrieval
system (containing roughly 350,000 pages of text) shows the system to
be retrieving less than 20 percent of the documents relevant to a
particular search. The findings are discussed in terms of the theory
and practice of full-text document retrieval.


AN EVALUATION OF RETRIEVAL
EFFECTIVENESS FOR A FULL-TEXT
DOCUMENT-RETRIEVAL SYSTEM


DAVID C. BLAIR and M. E. MARON


Document retrieval is the problem of finding stored documents that
contain useful information. There exist a set of documents on a range
of topics, written by different authors, at different times, and at
varying levels of depth, detail, clarity, and precision, and a set of
individuals who, at different times and for different reasons, search
for recorded information that may be contained in some of the
documents in this set. In each instance in which an individual seeks
information, he or she will find some documents of the set useful and
other documents not useful: the documents found useful are, we say,
relevant; the others, not relevant.

How should a collection of documents be organized so that a person
can find all and only the relevant items? One answer is automatic
full-text retrieval, which on its surface is disarmingly simple:
Store the full text of all documents in the collection on a computer
so that every character of every word in every sentence of every
document can be located by the machine. Then, when a person wants
information from that stored collection, the computer is instructed
to search for all documents containing certain specified words and
word combinations, which the user has specified.

Two elements make the idea of automatic full-text retrieval even more
attractive. On the one hand, digital technology continues to provide
computers that are larger, faster, cheaper, more reliable, and easier
to use; on the other hand, full-text retrieval avoids the need for
human indexers whose employment is increasingly costly and whose work
often appears inconsistent and less than fully effective.

© 1985 ACM 0001-0782/85/0300-0289 75¢

A pioneering test to evaluate the feasibility of full- 
text search and retrieval was conducted by Don Swan- 
son and reported in Science in 1960 [6]. Swanson con- 
cluded that text searching by computer was signifi- 
cantly better than conventional retrieval using human 
subject indexing. Ten years later, in 1970, Salton, also
in Science, reported optimistically on a series of experiments
on automatic full-text searching [3].

This paper describes a large-scale, full-text search and retrieval
experiment aimed at evaluating the effectiveness of full-text
retrieval. For the purposes of our study, we examined IBM's full-text
retrieval system, STAIRS. STAIRS, an acronym for "STorage And
Information Retrieval System," is a very fast, large-capacity,
full-text document-retrieval system. Our empirical study of STAIRS in
a litigation support situation showed its retrieval effectiveness to
be surprisingly poor. We offer theoretical reasons to explain why
this poor performance should not be surprising and also why our
experimental results are not inconsistent with the earlier, more
favorable results cited above. The retrieval problems we describe
would be problems with any large-scale, full-text retrieval system,
and in this sense our study should not be seen as a critique of
STAIRS alone, but rather a critique of the principles on which it and
other full-text document-retrieval systems are based.
THE ALLURE OF FULL-TEXT
DOCUMENT RETRIEVAL

Retrieving document texts by subject content occupies a special place
in the province of information retrieval because, unlike data
retrieval, the richness and flexibility of natural language have a
significant impact on the conduct of a search. The indexer chooses
subject terms that will describe the informational content of the
documents included in the database, and the user describes his or her
information need in terms of the subject descriptors actually
assigned to the documents (Figure 1). However, there are no clear and
precise rules to govern the indexers' choice of appropriate subject
terms, so that even trained indexers may be inconsistent in their
application of subject terms. Experimental studies have demonstrated
that different indexers will generally index the same document
differently [9], and even the same individual will not always select
the identical index terms if asked at a later time to index a
document he or she has already indexed. The problems associated with
manual assignment of subject descriptors make computerized, full-text
document retrieval extremely appealing. By entering the entire, or
the most significant part of, a document text onto the database, one
is freed, it is argued, from the inherent evils of manually creating
document records reflecting the subject content of a particular
document: among these, the construction of an indexing vocabulary,
the training of indexers, and the time consumed in scanning/reading
documents and assigning context and subject terms. The economies of
full-text search are appealing, but for it to be worthwhile, it must
also provide satisfactory levels of retrieval effectiveness.


MEASURING RETRIEVAL EFFECTIVENESS 

Two of the most widely used measures of document-retrieval
effectiveness are Recall and Precision. Recall measures how well a
system retrieves all the relevant documents; and Precision, how well
the system retrieves only the relevant documents. For the purposes of
this study, we define a document as relevant if it is judged useful
by the user who initiated the search. If not, then it is nonrelevant
(see [4]). More precisely, Recall is the proportion of relevant
documents that the system retrieves, the ratio x/n₂ (Figure 2), where
x is the number of relevant documents retrieved and n₂ is the total
number of relevant documents. Notice that one can interpret Recall as
the probability that a relevant document will be retrieved.
Precision, on the other hand, measures how well a system retrieves
only the relevant documents; it is defined as the ratio x/n₁, where
n₁ is the total number of documents retrieved, and can be interpreted
as the probability that a retrieved document will be relevant.
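
The two ratios can be illustrated with a minimal Python sketch. The function name, the document identifiers, and the counts are hypothetical and chosen only for illustration; they are not data from the study. Here x is the number of relevant documents retrieved, n1 the number of documents retrieved, and n2 the number of relevant documents in the collection.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    x = len(retrieved & relevant)   # relevant documents that were retrieved
    n1 = len(retrieved)             # all retrieved documents
    n2 = len(relevant)              # all relevant documents in the collection
    precision = x / n1 if n1 else 0.0
    recall = x / n2 if n2 else 0.0
    return precision, recall


# Hypothetical example: 5 documents retrieved, 4 of them relevant,
# out of 20 relevant documents in the whole collection.
retrieved_ids = ["d1", "d2", "d3", "d4", "x9"]
relevant_ids = ["d1", "d2", "d3", "d4"] + [f"r{i}" for i in range(16)]

p, r = precision_recall(retrieved_ids, relevant_ids)
print(f"Precision = {p:.2f}, Recall = {r:.2f}")
```

A searcher who sees only the retrieved set can judge Precision directly, but Recall depends on relevant documents that were never displayed.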





FIGURE 1. The Dynamics of Information Retrieval

FIGURE 2. Definitions of Precision and Recall


THE TEST ENVIRONMENT

The database examined in this study consisted of just under 40,000
documents, representing roughly 350,000 pages of hard-copy text,
which were to be used in the defense of a large corporate law suit.
Access to the documents was provided by IBM's STAIRS/TLS software
(STorage And Information Retrieval System/Thesaurus Linguistic
System). STAIRS software represents state-of-the-art software in
full-text retrieval. It provides facilities for retrieving text where
specified words appear either singly or in complex Boolean
combinations. A user can specify the retrieval of text in which words
appear together anywhere in the document, within the same paragraph,
within the same sentence, or adjacent to each other (as in "New"
adjacent "York"). Retrieval can also be performed on fields such as
author, date, and document number. STAIRS provides ranking functions
that permit the user to order retrieved sets of 200 documents or
fewer in either ascending or descending numerical (e.g., by date) or
alphabetic (e.g., by author) order. In addition, retrieved sets of
fewer than 200 documents can also be ordered by the frequency with
which specified search terms occur in the retrieved documents. The
Thesaurus Linguistic System (TLS) provides the facilities to manually
create an interactive thesaurus that can be called up by the user to
semantically broaden (or narrow) his or her searches; it allows the
designer to specify semantic relationships between search terms such
as "narrower than," "broader than," "related to," and "synonymous
with," as well as automatic phrase decomposition. STAIRS/TLS thus
represents a comprehensive full-text document-retrieval system.


THE EXPERIMENTAL PROTOCOL 

To test how well STAIRS could be used to retrieve all and only the
documents relevant to a given request for information, we wanted in
essence to determine the values of Recall (percentage of relevant
documents retrieved) and Precision (percentage of retrieved documents
that are relevant). Although Precision is an important measure of
retrieval effectiveness, it is meaningless unless compared to the
level of Recall desired by the user. In this case, the lawyers who
were to use the system for litigation support stipulated that they
must be able to retrieve at least 75 percent of all the documents
relevant to a given request for information, and that they regarded
this entire 75 percent as essential to the defense of the case. (The
lawyers divided the relevant retrieved documents into three groups:
"vital," "satisfactory," and "marginally relevant." All other
retrieved documents were considered "irrelevant.")
CONDUCT OF THE TEST

For the test, we attempted to have the retrieval system used in the
same way it would have been during actual litigation. Two lawyers,
the principal defense attorneys in the suit, participated in the
experiment. They generated a total of 51 different information
requests, which were translated into formal queries by either of two
paralegals, both of whom were familiar with the case and experienced
with the STAIRS system. The paralegals searched on the database until
they found a set of documents they believed would satisfy one of the
initial requests. The original hard copies of these documents were
retrieved from files, and xerox copies were sent to the lawyer who
originated the request. The lawyer then evaluated the documents,
ranking them according to whether they were "vital," "satisfactory,"
"marginally relevant," or "irrelevant" to the original request. The
lawyer then made an overall judgment concerning the set of documents
received, stating whether he or she wanted further refinement of the
query and further searching. The reasons for any subsequent query
revisions were made in writing and were fully recorded. The
information-request and query-formulation procedures were considered
complete only when the lawyer stated in writing that he or she was
satisfied with the search results for that particular query (i.e., in
his or her judgment, more than 75 percent of the "vital,"
"satisfactory," and "marginally relevant" documents had been
retrieved). It was only at this point that the task of measuring
Precision and Recall was begun. (A diagram of the information-request
procedure is given in Figure 3.) The lawyers and paralegals were
permitted as much interaction as they thought necessary to ensure
highly effective retrieval. The paralegals were able to seek
clarification of the lawyers' information request in as much detail
and as often as they desired, and the lawyers were encouraged to
continue requesting information from the database until they were
satisfied they had enough information to defend the lawsuit on that
particular issue or query. In the test, each query required a number
of revisions, and the lawyers were not generally satisfied until many
retrieved sets of documents had been generated and evaluated.

Precision was calculated by dividing the total number of relevant
(i.e., "vital," "satisfactory," and "marginally relevant") documents
retrieved by the total number of retrieved documents. If two or more
retrieved sets were generated before the lawyer was satisfied with
the results of the search, then the retrieved set considered for
calculating Precision was computed as the union of all retrieved sets
generated for that request. (Documents that appeared in more than one
retrieved set were automatically excluded from all but one set.)
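
The Precision calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure as implemented: the function name, document identifiers, and judgments are hypothetical. Taking the set union counts a document that appears in several retrieved sets only once.

```python
RELEVANT_GRADES = {"vital", "satisfactory", "marginally relevant"}


def precision_for_request(retrieved_sets, judgments):
    """Precision over the union of every retrieved set for one request.

    retrieved_sets: iterable of sets of document IDs (one per query revision)
    judgments: dict mapping document ID to the lawyer's grade
    """
    all_retrieved = set().union(*retrieved_sets)  # duplicates counted once
    if not all_retrieved:
        return 0.0
    relevant = [d for d in all_retrieved if judgments.get(d) in RELEVANT_GRADES]
    return len(relevant) / len(all_retrieved)


# Hypothetical request that needed two query revisions.
sets_for_request = [{"a", "b", "c"}, {"b", "c", "d", "e"}]
grades = {
    "a": "vital",
    "b": "irrelevant",
    "c": "satisfactory",
    "d": "marginally relevant",
    "e": "irrelevant",
}
request_precision = precision_for_request(sets_for_request, grades)
print(f"Precision = {request_precision:.2f}")
```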

Recall was considerably more difficult to calculate since it required
finding relevant documents that had not been retrieved in the course
of the lawyers' search. To find the unretrieved relevant documents,
we developed sample frames consisting of subsets of the unretrieved
database that we believed to be rich in relevant documents (and from
which duplicates of retrieved relevant documents had been excluded).
Random samples were taken from these subsets, and the samples were
examined by the lawyers in a blind evaluation; the lawyers were not
aware they were evaluating sample sets rather than retrieved sets
they had personally generated. The total number of relevant documents
that existed in these subsets could then be estimated. We sampled
from subsets of the database rather than the entire database because,
for most queries, the percentage of relevant documents in the
database was less than 2 percent, making it almost impossible to have
both manageable sample sizes and a high level of confidence in the
resulting Recall estimates. Of course, no extrapolation to the entire
database could be made from these Recall calculations. Nonetheless,
the estimation of the number of relevant unretrieved documents in the
subsets did give us a maximum value for Recall for each request.

FIGURE 3. The Information Request Procedure

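
The sampling argument can be made concrete with a short Python sketch. It assumes a simple proportional estimate (the fraction of relevant documents in the random sample is scaled up to the size of the subset); the source does not spell out the estimator, and the function name and all numbers below are hypothetical. Because relevant documents outside the sampled subsets are not counted, the result is an upper bound on Recall.

```python
def estimate_max_recall(retrieved_relevant, subset_size, sample_size, sample_relevant):
    """Upper-bound Recall estimate from a random sample of unretrieved documents.

    retrieved_relevant: relevant documents the search actually retrieved
    subset_size: number of unretrieved documents in the sample frame
    sample_size: number of documents drawn at random from that frame
    sample_relevant: documents in the sample judged relevant
    """
    if sample_size <= 0 or sample_size > subset_size:
        raise ValueError("sample_size must be between 1 and subset_size")
    # Scale the sample proportion up to the whole sample frame.
    estimated_unretrieved = subset_size * sample_relevant / sample_size
    total_relevant = retrieved_relevant + estimated_unretrieved
    return retrieved_relevant / total_relevant if total_relevant else 0.0


# Hypothetical request: 40 relevant documents retrieved; a sample of 200
# documents from a 2,000-document frame contains 16 relevant ones.
max_recall = estimate_max_recall(40, subset_size=2000, sample_size=200, sample_relevant=16)
print(f"Estimated maximum Recall = {max_recall:.2f}")
```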
TEST RESULTS 
Of the 51 retrieval requests processed, values of Precision and
Recall were calculated for 40. The other 11 requests were used to
check our sampling techniques and control for possible bias in the
evaluation of retrieved and sample sets.

In Table I we show the values of Precision and Recall for each of the
40 requests. The values of Precision ranged from a maximum of 100.0
percent to a minimum of 19.6 percent. The unweighted average value of
Precision turned out to be 79.0 percent (standard deviation = 23.2).
The weighted average was 75.5 percent. This meant that, on average,
79 out of every 100 documents retrieved using STAIRS were judged to
be relevant.

The values of Recall ranged from a maximum of 78.7 percent to a
minimum of 2.8 percent. The unweighted average value of Recall was 20
percent (standard deviation = 15.9), and the weighted average value
was 20.26 percent. This meant that, on average, STAIRS could be used
to retrieve only about 20 percent of the relevant documents.


When we plot the value of Precision against the corresponding value
of Recall for each of the 40 information requests, we get the scatter
diagram given in Figure 4. Although Figure 4 contains no more data
than Table I, it does show the relationships in a more explicit way.
For example, the heavy clustering of points in the lower right corner
shows that in over 50 percent of the cases we get values of Precision
above 80 percent with Recall at or below 20 percent. The clustering
in the lower portion of the diagram shows that in 80 percent of the
information requests the value of Recall was at or below 20 percent.
Figure 4 also depicts the frequently observed inverse relationship
between Recall and Precision, where high values of Precision are
often accompanied by low values for Recall, and vice versa.


TABLE I. Recall and Precision Values for Each Information Request

Information                       Information
request     Recall   Precision    request     Recall   Precision
number      (%)      (%)          number      (%)      (%)
   1          *         *           27        50.0      42.6
   2        45.5      92.6          28        50.0      19.6
   3          *         *           29          *         *
   4          *         *           30         7.0     100.0
   5          *         *           31          *         *
   6         8.9      60.0          32        12.5     100.0
   7        20.6      64.7          33        18.2      79.5
   8        43.9      88.8          34        14.1      45.1
   9        13.3      48.9          35          *         *
  10        10.4      96.8          36         4.2      33.3
  11        12.8     100.0          37        15.9      81.8
  12         9.6      84.2          38        24.7      68.3
  13        15.1      85.0          39        18.5      83.3
  14        78.7      99.0          40         4.1     100.0
  15          *         *           41        18.3      96.9
  16          *         *           42        45.4      91.0
  17          *         *           43        18.9     100.0
  18        13.0      38.0          44        10.6     100.0
  19        15.8      42.1          45        20.3      94.0
  20        19.4      68.9          46        11.0      85.7
  21        41.0      33.8          47        13.4     100.0
  22        22.2      94.8          48        13.7      87.5
  23         2.8     100.0          49        17.4      87.8
  24          *         *           50        13.5      75.7
  25        13.0      94.0          51         4.7     100.0
  26         7.2      95.0

* No value listed for this request.


FIGURE 4. Plot of Precision versus Recall for All Information Requests
Average Recall = 20.0% (Standard deviation = 15.9)
Average Precision = 79.0% (Standard deviation = 23.3)


OTHER FINDINGS

After the initial Recall/Precision estimations were done, several
other statistical calculations were carried out in the hope that
additional inferences could be made. First, the results were broken
down by lawyer to ascertain whether certain individuals were prima
facie more adept at using the system than others. The results were as
follows:


Recall Precision 
Lawyer 1 22.7% 76.0% 
Lawyer 2 18.0% 81.4% 


Although there is some difference between the results for each
lawyer, the variance is not statistically significant at the .05
level. Although this was a very limited test, we can conclude that at
least for this experiment the results were independent of the
particular user involved.

Another area of interest related to the revisions made 
to requests when the lawyer was not completely satis- 
fied with the initial retrieved sets of documents. We 
hypothesized that if the values of Recall and Precision 
for the requests where substantial revisions had to be 
made (about 30 percent of the total) were significantly 
different from the overall mean values we might be 
able to infer something about the requesting procedure. 
Unfortunately, the values for Recall and Precision for 
the substantially revised queries (23.9 percent and 62.1 
percent, respectively) did not indicate a statistically sig- 
nificant difference. 

Finally, we tested the hypothesis that extremely high 
values of Precision for the retrieved sets would corre- 
late directly with the lawyers’ judgments of satisfaction 
with that set of documents (which might indicate that 
the lawyers were confusing Precision with Recall). To
do this, we computed the mean Precision for all requests where the
lawyers were satisfied with the initial retrieved set, and compared
this value to the mean Precision for all requests. Although the
Precision for requests that were not revised came out to be 85.4
percent, again the results were not statistically significant at the
.05 level.


The Retrieval Effectiveness of
Lawyers versus Paralegals

The argument can be made that, because STAIRS is a high-speed,
on-line, interactive system, the searcher at the terminal can quickly
and effectively evaluate the output of STAIRS during the query
modification process. Therefore, retrieval effectiveness might be
significantly improved if the person originating the information
request is actually doing the searching at the terminal. This would
mean that if a lawyer worked directly on the query formulation and
query modification at the STAIRS terminal, rather than using a
paralegal as intermediary, retrieval effectiveness might be improved.

We tested this conjecture by comparing the retrieval effectiveness of
the lawyer vis-à-vis the paralegal on the same information request.
We selected (at random) five information requests for which the
searches had already been completed by the paralegal, and for which
retrieved sets had been evaluated by the lawyer and values of Recall
computed. (Neither the lawyer who made the relevance judgments nor
the paralegal knew the Recall figures for these original requests.)
We invited the lawyer to use STAIRS directly to access the database,
giving the lawyer copies of his or her original information requests.
The lawyer translated these requests into formal queries, evaluating
the text displayed on the screen, modifying the queries as he or she
saw fit, and finally deciding when to terminate the search. For each
of the five information requests, we estimated the minimum number of
relevant documents in the entire file, and knowing which documents
the lawyer had previously judged relevant, we were able to compute
the values of Recall for the lawyer at the terminal as we had already
done for the paralegal. If it were true that STAIRS would give better
results when the lawyers themselves worked at the terminal, the
values of Recall for the lawyers would have to be significantly
higher than the values of Recall when the paralegals did the
searching. The results were as follows:


Request     Recall          Recall
number      (paralegal)     (lawyer)
   1          7.2%            6.6%
   2         19.4%           10.3%
   3          4.2%           26.4%
   4          4.1%            7.4%
   5         18.9%           25.3%
Mean         10.7%           15.2%
            (s.d. = 7.65)   (s.d. = 9.83)


Although there is a marked improvement in the lawyer's Recall for
requests 3, 4, and 5, and in the average Recall for all five
information requests, the improvement is not statistically
significant at the .05 level (z = -0.81). Hence, we cannot reject the
hypothesis that both the lawyer and the paralegal get the same
results for Recall.
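
The source does not say which test produced the reported statistic. As a rough check, the Python sketch below computes an unpaired two-sample statistic from the five Recall pairs in the table, reading the lawyer's value for request 4 as 7.4 percent (the reading consistent with the reported mean of 15.2 percent). The choice of formula is an assumption; it gives a value close to, but not exactly, the reported -0.81.

```python
from math import sqrt
from statistics import mean, stdev

# Recall (%) for the five requests, as listed in the table above.
paralegal = [7.2, 19.4, 4.2, 4.1, 18.9]
lawyer = [6.6, 10.3, 26.4, 7.4, 25.3]


def two_sample_statistic(a, b):
    """Difference of means divided by its unpooled standard error."""
    standard_error = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / standard_error


z = two_sample_statistic(paralegal, lawyer)
print(f"paralegal mean = {mean(paralegal):.1f}, lawyer mean = {mean(lawyer):.1f}")
print(f"statistic = {z:.2f}")
```

With only five requests per group, a statistic of this size falls well short of the usual two-sided .05 cutoff of about 1.96, which agrees with the conclusion that the difference is not significant.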


WHY WAS RECALL SO LOW?

The realization that STAIRS may be retrieving only one out of five
relevant documents in response to an information request may surprise
those who have used STAIRS or had it demonstrated to them. This is
because they will have seen only the retrieved set of documents and
not the total corpus of relevant documents; that is, they have seen
that the proportion of relevant documents in the retrieved set (i.e.,
Precision) is quite good (around 80 percent). The important issues to
consider here are (1) why was Recall so low and (2) why did the users
(lawyers and paralegals) believe they were retrieving 75 percent of
the relevant documents when, in fact, they were only retrieving 20
percent.

The low values of Recall occurred because full-text retrieval is
difficult to use to retrieve documents by subject because its design
is based on the assumption that it is a simple matter for users to
foresee the exact words and phrases that will be used in the
documents they will find useful, and only in those documents. This
assumption is not a new one; it goes back over 25 years to the early
days of computing. The basic idea is that one can use the formal
aspects of text to predict its meaning or subject content: formal
aspects such as the occurrence, location, and frequency of words;
and, to the extent that it can be precisely described, the syntactic
structure of word phrases. It was hoped that by exploiting the high
speed of a computer to analyze the formal aspects of text, one could
get the computer to deal with text in a "comprehending-like" way
(i.e., to identify the subject content of texts). This endeavor is
known as "Automatic Indexing" or, in a more general sense, "Natural
Language Processing." During the past two decades, many experiments
in automatic indexing (of which full-text searching is the simplest
form) have been carried out, and many discussions by linguists,
psychologists, philosophers, and computer scientists have analyzed
the results and the issues [5]. These experiments show that full-text
document retrieval has worked well only on unrealistically small
databases.

The belief in the predictability of the words and phrases that may be
used to discuss a particular subject is a difficult prejudice to
overcome. In a naive sort of way, it is an appealing prejudice but a
prejudice nonetheless, because the effectiveness of full-text
retrieval has not been substantiated by reliable Recall measures on
realistically large databases. Stated succinctly, it is impossibly
difficult for users to predict the exact words, word combinations,
and phrases that are used by all (or most) relevant documents and
only (or primarily) by those documents, as can be seen in the
following example.

In the legal case in question, one concern of the lawyers was an
accident that had occurred and was now an object of litigation. The
lawyers wanted all the reports, correspondence, memoranda, and
minutes of meetings that discussed this accident. Formal queries were
constructed that contained the word "accident(s)" along with several
relevant proper nouns. In our search for unretrieved relevant
documents, we later found that the accident was not always referred
to as an "accident," but as an "event," "incident," "situation,"
"problem," or "difficulty," often without mentioning any of the
relevant proper names. The manner in which an individual referred to
the incident was frequently dependent on his or her point of view.
Those who discussed the event in a critical or accusatory way
referred to it quite directly, as an "accident." Those who were
personally involved in the event, and perhaps culpable, tended to
refer to it euphemistically as, inter alia, an "unfortunate
situation," or a "difficulty." Sometimes the accident was referred to
obliquely as "the subject of your last letter," "what happened last
week was ...," or, as in the opening lines of the minutes of a
meeting on the issue, "Mr. A: We all know why we're here ...."
Sometimes relevant documents dealt with the problem by mentioning
only the technical aspects of why the accident occurred, but neither
the accident itself nor the people involved. Finally, much relevant
information discussed the situation prior to the accident and,
naturally, contained no reference to the accident itself.
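
The effect described in this example can be reproduced on a toy scale. The Python sketch below is only an illustration: the five-document corpus, the document names, and the simple substring-matching search function are hypothetical and are not drawn from the case or from STAIRS.

```python
def boolean_or_search(documents, terms):
    """Return IDs of documents containing at least one term.

    Matching is a case-insensitive substring test, a deliberate
    simplification of word-level full-text search.
    """
    lowered = [t.lower() for t in terms]
    return {
        doc_id
        for doc_id, text in documents.items()
        if any(t in text.lower() for t in lowered)
    }


# Hypothetical corpus; the first four documents concern the same event.
corpus = {
    "memo-1": "Report on the accident at the plant.",
    "memo-2": "Regarding the unfortunate situation last week.",
    "memo-3": "Minutes. Mr. A: We all know why we're here.",
    "memo-4": "The incident was caused by a valve failure.",
    "memo-5": "Quarterly cafeteria budget.",
}
relevant = {"memo-1", "memo-2", "memo-3", "memo-4"}

narrow = boolean_or_search(corpus, ["accident"])
broad = boolean_or_search(corpus, ["accident", "incident", "situation"])

recall_narrow = len(narrow & relevant) / len(relevant)
recall_broad = len(broad & relevant) / len(relevant)
print(f"Recall with one term: {recall_narrow:.2f}")
print(f"Recall with three terms: {recall_broad:.2f}")
```

Even the broadened query misses the oblique reference in the minutes, and both queries return only relevant documents, so the searcher sees perfect Precision and has no direct signal that anything is missing.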

Another information request resulted in the identification of 3 key
terms or phrases that were used to retrieve relevant information;
later, we were able to find 26 other words and phrases that retrieved
additional relevant documents. The 3 original key terms could not
have been used individually as they would have retrieved 420
documents, or approximately 4000 pages of hard copy, an unreasonably
large set, most of which contained irrelevant information. Another
request identified 4 key terms/phrases that retrieved relevant
documents, which we were later able to enlarge by 44 additional terms
and combinations of terms to retrieve relevant documents that had
been missed.

Sometimes we followed a trail of linguistic creativity through the
database. In searching for documents discussing "trap correction"
(one of the key phrases), we discovered that relevant, unretrieved
documents had discussed the same issue but referred to it as the
"wire warp." Continuing our search, we found that in still other
documents trap correction was referred to in a third and novel way:
the "shunt correction system." Finally, we discovered the inventor of
this system was a man named "Coxwell," which directed us to some
documents he had authored, only he referred to the system as the
"Roman circle method." Using the Roman circle method in a query
directed us to still more relevant but unretrieved documents, but
this was not the end either. Further searching revealed that the
system had been tested in another city, and all documents germane to
those tests referred to the system as the "air truck." At this point
the search ended, having consumed over an entire 40-hour week of
on-line searching, but there is no reason to believe that we had
reached the end of the trail; we simply ran out of time.

As the database included many items of personal correspondence as
well as the verbatim minutes of meetings, the use of slang frequently
changed the way in which one would "normally" talk about a subject.
Disabled or malfunctioning mechanisms with which the lawsuit was
concerned were sometimes referred to as "sick" or "dead," and a
burned-out circuit was referred to as being "fried." A critical issue
was sometimes referred to as the "smoking gun."

Even misspellings proved an obstacle. Key search terms like
"flattening," "gauge," "memos," and "correspondence," which were
essential parts of phrases, were used effectively to retrieve
relevant documents. However, the misspellings "flatening," "guage,"
"gage," "memoes," and "correspondance," using the same phrases, also
retrieved relevant documents. Misspellings like these, which are
tolerable in normal everyday correspondence, when included in a
computerized database become literal traps for users who are asked
not only to anticipate the key words and phrases that may be used to
discuss an issue but also to foresee the whole range of possible
misspellings, letter transpositions, and [illegible in source].

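
To show how easily exact-match searching misses such variants, the Python sketch below uses the standard-library difflib module to list the words in an index vocabulary that closely resemble a search term. This is an illustration only and is not a feature the source attributes to STAIRS; the function name, the vocabulary list, and the 0.75 similarity cutoff are assumptions.

```python
import difflib


def spelling_variants(term, vocabulary, cutoff=0.75):
    """Return vocabulary words that closely resemble `term`.

    Uses difflib's similarity ratio; the cutoff is an arbitrary
    illustrative threshold, not a tuned value.
    """
    candidates = [word for word in vocabulary if word != term]
    return difflib.get_close_matches(term, candidates, n=10, cutoff=cutoff)


# Hypothetical index vocabulary containing the misspellings quoted above.
vocabulary = [
    "gauge", "guage", "gage",
    "flattening", "flatening",
    "memos", "memoes",
    "correspondence", "correspondance",
    "girder", "budget",
]

for search_term in ["gauge", "flattening", "memos", "correspondence"]:
    print(search_term, "->", spelling_variants(search_term, vocabulary))
```

A searcher who types only the correct spelling retrieves none of the documents indexed under the variants, which is the trap the paragraph describes.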
Some information requests placed almost impossible demands on the
ingenuity of the individual constructing the query. In one situation,
the lawyer wanted "Company A's comments concerning ...." Looking at
the documents authored by Company A was not enough, as many relevant
comments were embedded in the minutes of meetings or recorded
secondhand in the documents authored by others. Retrieving all the
documents in which Company A was mentioned was too broad a search; it
retrieved over 5,000 documents (about 40,000+ pages of hard copy).
However, predicting the exact phraseology of the text in which
Company A commented on the issue was almost impossible: sometimes
Company A was not even mentioned, only that so-and-so (representing
Company A) "said/considered/remarked/pointed out/commented/noted/
explained/discussed," etc.

In some requests, the most important terms and 
phrases were not used at all in relevant documents. For 
example, “steel quantity” was a key phrase used to 
retrieve important relevant documents germane to an 
actionable issue, but unretrieved relevant documents 
were also found that did not report steel quantity at all, 
but merely the number of such things as “girders,” 
“beams,” “frames,” “bracings,” etc. In another request, it
was important to find documents that discussed “non- 
expendable components.” In this case, relevant unre- 
trieved documents merely listed the names of the com- 
ponents (of which there were hundreds) and made no 
mention of the broader generic description of these 
items as “nonexpendable.” 

Why didn't the lawyers realize they were not getting
all of the information relevant to a particular issue?
Certainly they knew the lawsuit. They had been involved
with it from the beginning and were the principal
attorneys representing the defense. In addition, one
of the paralegals had been instrumental not only in
setting up the database but also in supervising the selection
of relevant information to be put on-line. Might
it not be reasonable to expect them to be suspicious
that they were not retrieving everything they wanted?
Not really. Because the database was so large (providing
access to over 350,000 pages of hard copy, all of which
was in some way pertinent to the lawsuit), it would be
unreasonable to expect four individuals (two lawyers
and two paralegals) to have total recall of all the important
supporting facts, testimony, and related data that
were germane to the case. If they had such recall they
would have no need for a computerized, interactive
retrieval system. It is well known among cognitive
psychologists that man's power of literal recall is much less
effective than his power of recognition. The lawyers
could remember the exact text of some of the important
information, but as we have already stated, this
was a very small subset of the total information relevant
to a particular issue. They could recognize the important
information when they saw it, and they could
do so with uncanny consistency. (As a control, we submitted
some retrieved sets and sample sets of documents
to the lawyers several times in a blind test of
their evaluation consistency, and found that their consistency
was almost perfect.) Also, since the lawyers
were not experts in information retrieval system design,
there were no a priori reasons for them to suspect
the Recall levels of STAIRS.


DETERIORATION OF RECALL AS
A FUNCTION OF FILE SIZE

One reason why Recall evaluations done on small databases
cannot be used to estimate Recall on larger databases
is that, ceteris paribus, the value of Recall
decreases as the size of the database increases, or, from
a different point of view, the amount of search effort
required to obtain the same Recall level increases as
the database increases, often at a faster rate than the
increase in database size. On the database we studied,
there were many search terms that, used by themselves,
would retrieve over 10,000 documents. Such
output overload is a frequent problem of full-text retrieval
systems.

As a retrieved set of several thousand documents is
impractical, the user must reduce the output overload
by reformulating the single-term query so that it retrieves
fewer documents. If a single-term query w1 retrieves
too many documents, the user may add another
term, w2, so as to form the new query “w1 and w2” (or
“w1 adjacent w2” or “w1 same w2”). The reformulated
query cannot retrieve more documents than the original;
most probably, it will retrieve many fewer. The
process of adding intersecting terms to a query can be
continued until the size of the output reaches a manageable
number. (This strategy, and its consequences, is
discussed in more detail in [1].) However, as the user
narrows the size of the output by adding intersecting
terms, the value of Recall goes down because, with
each new term, the probability is that some relevant
documents will be excluded by that reformulated
query.

The deterioration of Recall from a probabilistic point
of view is quite startling. For each query, there is a
class of relevant documents that we designate as R. We
represent the probability that each of those documents
will contain some word w1 as p, and the probability that
a relevant document will contain some other word w2
as q. Thus, the value of Recall for a request using only
w1 will be equal to p, and Recall for a request using
only w2 will be equal to q. Now the probability that a
relevant document will contain both w1 and w2 is less
than or equal to either p or q. If we assume that the
respective appearances of w1 and w2 in a relevant document
are independent events, then the probability of
both of them appearing in a relevant document would
be equal to the product of p and q. Since both p and q
are usually numbers less than unity, their product usually
will be smaller than either p or q. This means that
Recall, which can also be thought of as the probability
of retrieving a relevant document, is now equal to the
product of p and q. In other words, reducing the number
of documents retrieved by intersecting an increasing
number of terms in the formal query causes Recall
for that query also to decrease.

However, the problem is really much worse. In order
for a relevant document, which contains w1 and w2, to
be retrieved by a single query, a searcher must select
and use those words in his or her query. The probability
that the searcher will select w1 is, of course, generally
less than 1.0; and the probability that w1 will occur
in a relevant document is also usually less than 1.0.
However, these probabilities must be multiplied by the
probability that the searcher will select w2 as part of his
or her query, and the probability that w2 will occur in a
relevant document. Thus, calculating Recall for a two-term
search involves the multiplication of four numbers
each of which is usually less than 1.0. As a result,
the value of Recall gets very small (see Table II). When
we consider a three- or four-term query, the value of
Recall drops off even more sharply.

TABLE II. The Probability of Retrieving a Relevant Document
Containing Terms w1 and w2

P(Sw1) = .6 = Probability searcher uses term w1 in a search query
P(Sw2) = .5 = Probability searcher uses term w2 in a search query
P(Dw1) = .7 = Probability w1 appears in a relevant document
P(Dw2) = .6 = Probability w2 appears in a relevant document

Probability of searcher selecting w1 and a relevant document
containing w1:
    P(Sw1) x P(Dw1) = (.6) x (.7) = .42

Probability of searcher selecting w2 and a relevant document
containing w2:
    P(Sw2) x P(Dw2) = (.5) x (.6) = .30

Probability of searcher selecting w1 and w2 and a relevant
document containing w1 and w2:
    P(Sw1) x P(Dw1) x P(Sw2) x P(Dw2)
    (e.g., (.6) x (.7) x (.5) x (.6) = .126)

The problem of output overload is especially critical
in full-text retrieval systems like STAIRS, where the
frequency of occurrence of search terms is considerably
larger than (and increases faster than) the frequency of
occurrence (or “breadth”) of index terms in a database
where the terms are manually assigned to documents.
This means that the user of a full-text retrieval system
will face the problem of output overload sooner than
the user of a manually indexed system. The solution
that STAIRS offers, conjunctively adding search terms
to the query, does reduce the number of documents
retrieved to a manageable number but also eliminates
relevant documents. Search queries employing four or
five intersecting terms were not uncommon among the
queries used in our test. However, the probability that
a query that intersects five terms will retrieve relevant
documents is quite small. If we were to assign a probability
of .7 to all the respective probabilities in a hypothetical
five-term query as we did in the two-term
query in Table II (and .7 is an optimistic average value),
the Recall level for that query would be .028. In other
words, that query could be expected to retrieve less
than 3 percent of the relevant documents in the database.
If the probabilities for the five-term query were a
more realistic average of .5, the Recall value for that
query would be .0009! This means that if there were
1000 relevant documents on the database, it is likely
that this query would retrieve only one of them. The
searcher must submit many such low-yield queries to
the system if he or she wants to retrieve a high percentage
of the relevant documents.
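
The arithmetic behind Table II and the five-term figures can be checked with a short sketch. It assumes, as the text does, that every selection and occurrence event is independent, so that Recall is simply the product of all the probabilities involved. The function name `conjunctive_recall` and its argument names are illustrative only and do not come from the study.

```python
from math import prod


def conjunctive_recall(select_probs, occur_probs):
    """Expected Recall of a query that intersects (ANDs) several terms.

    select_probs[i] is the probability that the searcher uses term i;
    occur_probs[i] is the probability that term i appears in a relevant
    document. All events are assumed independent (an assumption taken
    from the text), so the probabilities simply multiply.
    """
    if len(select_probs) != len(occur_probs):
        raise ValueError("one selection and one occurrence probability per term")
    return prod(s * d for s, d in zip(select_probs, occur_probs))


# Two-term query with the Table II values.
two_term = conjunctive_recall([0.6, 0.5], [0.7, 0.6])

# Hypothetical five-term queries with every probability set to .7 or .5.
five_term_optimistic = conjunctive_recall([0.7] * 5, [0.7] * 5)
five_term_realistic = conjunctive_recall([0.5] * 5, [0.5] * 5)

print(two_term, five_term_optimistic, five_term_realistic)
```

Under these assumptions the two-term value matches the .126 of Table II, the all-.7 case comes to about .028, and the all-.5 case comes to just under .001, in line with the figure of roughly one document in a thousand quoted above.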


DISCUSSION 

The reader who is surprised at the results of this test of 
retrieval effectiveness is not alone. The lawyers who 
participated in the test were equally astonished. Al- 
though there are sound theoretical reasons why we 
should expect these results, they seem to run counter
to previous tests of retrieval effectiveness for full-text 
retrieval. 

Two pioneering evaluations of full-text retrieval systems
by respected researchers in the field (Swanson [5]
and Salton [3]) determined to their satisfaction that
full-text document-retrieval systems could retrieve relevant
documents at a satisfactory level while avoiding
the problems of manual indexing. Our study, on the
other hand, shows that full-text document retrieval
does not operate at satisfactory levels and that there are
sound theoretical reasons to expect this to be so. Who is
right? Well, we all are, and this is not an equivocation.
The two earlier studies drew the correct conclusions
from their evaluations, but these conclusions were different
from ours because they were based on small
experimental databases of fewer than 750 documents.
Our study was done not on an experimental database
but on an actual, operational database of almost 40,000
documents. Had Swanson and Salton been fortunate
enough to study a retrieval system as large as ours, they
would undoubtedly have observed similar phenomena
(Swanson was later to comment perceptively on the
difficulty of drawing accurate conclusions about document
retrieval from experiments using small databases
[7]). In addition, it has only recently been observed that
information-retrieval systems do not scale up [2]. That
is, retrieval strategies that work well on small systems
do not necessarily work well on larger systems (primarily
because of output overload). This means that studies
of retrieval effectiveness must be done on full-sized
retrieval systems if the results are to be indicative of
how a large, operational system would perform. However,
large-scale, detailed retrieval-effectiveness studies,
like the one reported here, are unprecedented because
they are incredibly expensive and time consuming:
our experiment took six months; involved two researchers
and six support staff; and, taking into account
all direct and indirect expenses, cost almost half a million
dollars. Nevertheless, Swanson and Salton’s earlier
full-text evaluations remain pioneering studies and,
rather than contradict our findings, have an illuminating
value of their own.

An objection that might be made to our evaluation of
STAIRS is that the low Recall observed was not due to
STAIRS but rather to query-formulation error. This objection
is based on the realization that, at least in principle,
virtually any subset of the database is retrievable
by some simple or complex combination of search
terms. The user's task is simply to find the right combination
of search terms to retrieve all and only the relevant
documents. However, we believe that users should
not be asked to shoulder the blame, and perhaps an
analogy will indicate why. Suppose you ask a company
to make a lock for you, and they oblige by providing a
combination lock; but when you ask them for the combination
to open the lock, they say that finding the
correct combination is your problem, not theirs. Now, it
is possible, in principle, to find the correct combination,
but in practice it may be impossibly difficult to do so. A
full-text retrieval system bears the burden of retrieval
failure because it places the user in the position of
having to find (in a relatively short time) an impossibly
difficult combination of search terms. The person using
a full-text retrieval system to find information on a
relatively large database is in the same unenviable position
as the individual looking for the combination to
the lock. It is true that we, as evaluators, found the
combinations of search terms necessary to retrieve
many of the unretrieved relevant documents, but three
things should be kept in mind. First, we make no claim
to having found all the relevant unretrieved documents;
we may not have found even half of them, as
our sampling technique covered only a small percentage
of the database. Second, a tremendous amount of
search time was involved with each request (sometimes
over 40 hours of on-line time), and the entire test took
almost 6 months. Such inefficiency is clearly not consonant
with the high speed desired for computerized retrieval.
Third, the evaluators in this case represented,
together, over 40 years of practical and theoretical experience [...]

[...] sold under the premise that it is easy to use and requires
no sophisticated training on the part of the user.
Yet this study is a clear demonstration of just how
sophisticated search skills must be to use STAIRS, or,
mutatis mutandis, any other full-text retrieval system.
There is evidence that this problem is beginning to be
recognized by at least one full-text retrieval vendor,
WESTLAW, which has made its reputation by offering
full-text access to legal cases. WESTLAW has now begun
to supplement its full-text retrieval with manually
assigned index terms.


SUMMARY 

This paper has presented a major, detailed evaluation 
of a full-text document-retrieval system. We have 
shown that the system did not work well in the envi- 
ronment in which it was tested and that there are theo- 
retical reasons why full-text retrieval systems applied 
to large databases are unlikely to perform well in any 
retrieval environment. The optimism of early studies 
was based on the small size of the databases used, and
were geared toward showing only that full-text search 
was competitive with searching based on manually as- 
signed index terms, under the assumption that, if it 
were competitive, full-text retrieval would eliminate 
the cost of indexing. However, there are costs associ- 
ated with a full-text system that a manual system does 
not incur. First, there is the increased time and cost of 
entering the full text of a document rather than a set of 
manually assigned subject and context descriptors. The 
average length of a document record on the system we 
evaluated was about 10,000 characters. In a manually
assigned index-term system of the same type, we found
the average document record to be less than 500 characters.
Thus, the full-text system incurs the additional
cost of inputting and verifying 20 times the amount of
information that a manually indexed system would
need to deal with. This difference alone would more
than compensate for the added time needed for manual
indexing and vocabulary construction. The 20-fold
increase in document record size also means that the
database for a full-text system will be some 20 times
larger than a manually indexed database and entail
increased storage and searching costs. Finally, because
the average number of searchable subject terms per
document for the full-text retrieval system described
here was approximately 500, whereas a manually indexed
system might have a subject indexing depth of
about 10, the dictionary that lists and keeps track of
these assignments (i.e., provides pointers to the database)
could be as much as 50 times larger on a full-text
system than on a manually indexed system. A full-text
retrieval system does not give us something for nothing.
Full-text searching is one of those things, as Samuel
Johnson put it so succinctly, that “. . . is never done
well, and one is surprised to see it done at all.”


Neural Networks: Applications in
Industry, Business and Science

BERNARD WIDROW, DAVID E. RUMELHART, and MICHAEL A. LEHR


Just four years ago, the only widely reported commercial application of neural network
technology outside the financial industry was the airport baggage explosive detection system 
[27] developed at Science Applications International Corporation (SAIC). Since that time 
scores of industrial and commercial applications have come into use, but the details of most of 
these systems are considered corporate secrets and are shrouded in secrecy. This hastening trend 
is due in part to the availability of an increasingly wide array of dedicated neural network 
hardware. This hardware is either in the form of accelerator cards for PCs and workstations or a 
large number of integrated circuits implementing digital and analog neural networks either 
currently available or in the final stages of design. An assortment of tools and development 
systems is provided by the manufacturers of most of these products. 


Complementing the hardware are
scores of commercial software packages
now available. Many packages
can be quickly tailored to provide
low-cost turnkey solutions to a broad
spectrum of applications. A very useful
list containing 64 of these software
and hardware tools together
with their prices and the names, addresses,
and phone numbers of the
vendors is published in a recent issue
of the magazine PC AI [17]. Other
valuable lists of neural network tools
and vendors can be found in the February
issue of Dr. Dobb's Journal [11]
and the June 1992 issue of AI Expert.
That these lists are not complete is an
indication of the rapid growth the
field is presently enjoying. It is not
possible in a short article to cite all of
the existing applications. The examples
described herein are meant only
to be representative samples.


Linear Neural Network 
Applications 

The first successful applications of
adaptive neural networks were developed
by Widrow and Hoff in the
1960s. They employed single-neuron
linear networks trained by the LMS
algorithm [32]. Single-element and
multielement linear networks are
equally easy to train and have found
widespread commercial application
over the past three decades. A few of
these applications include:

• Telecommunications. Modems
used in the high-speed transmission
of digital data through telephone
channels use adaptive line equalizers
and adaptive echo cancellers. Each
adaptive system utilizes a single-neuron
neural network. The most
significant commercial application of
neural networks today is in this area.

• Control of sound and vibration.
Active control of vibration and noise
is accomplished by using an adaptive
actuator to generate equal and opposite
vibration and noise. This is being
used in air-conditioning systems, in
automotive systems, and in industrial
applications.

• Particle accelerator beam control.
The Stanford Linear Accelerator
Center (SLAC) is now using adaptive
techniques to cancel disturbances
that diminish the positioning accuracy
of opposing beams of positrons
and electrons in a particle collider.
The accuracy is being held to within
2 microns in order to have a satisfactory
number of collisions. The efficiency
of this 3-kilometer long, billion
dollar machine is being
enhanced by the use of linear adaptive
noise cancelling.
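
As a rough illustration of the single-neuron linear networks behind the applications listed above, the following sketch trains one linear neuron with the Widrow-Hoff LMS rule, w <- w + 2*mu*e*x. The two-input setup, the step size `mu`, the synthetic noiseless data, and the names `lms_train` and `true_weights` are assumptions made for this example; they are not taken from the article or from any of the commercial systems it describes.

```python
import random


def lms_train(samples, n_weights, mu=0.05):
    """Train a single linear neuron with the LMS (Widrow-Hoff) rule.

    samples is an iterable of (inputs, desired) pairs; mu is the step size.
    """
    weights = [0.0] * n_weights
    for inputs, desired in samples:
        # Linear neuron: the output is the weighted sum of the inputs.
        output = sum(w * x for w, x in zip(weights, inputs))
        error = desired - output
        # Step down the gradient of the instantaneous squared error.
        weights = [w + 2.0 * mu * error * x for w, x in zip(weights, inputs)]
    return weights


# Hypothetical "unknown" linear system that the neuron should identify.
true_weights = [0.5, -0.3]

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = []
for _ in range(2000):
    x = [rng.uniform(-1.0, 1.0) for _ in true_weights]
    d = sum(t * xi for t, xi in zip(true_weights, x))
    samples.append((x, d))

learned = lms_train(samples, n_weights=len(true_weights))
print(learned)
```

With noiseless data and a small step size the learned weights settle close to `true_weights`. An adaptive equalizer or noise canceller applies the same kind of update to a tapped delay line of signal samples rather than to independent random inputs.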


Multielement Nonlinear 
Network Applications 

Unlike their linear counterparts 
which have a long track record of 
success, nonlinear multielement neu- 
ral networks have begun proving 
themselves in commercial applica- 
tions only recently. This is largely 
because the most useful neural network
algorithm, backpropagation,
did not become widely known
until 1986, when it was published in
Rumelhart and McClelland’s two-volume
PDP set [21]. Also important
in the timing of the current boom in
nonlinear neural network applications
has been the rapid advance of
computer and microprocessor performance,
which continues to improve
the feasibility and cost-effectiveness
of computationally intensive
algorithms. Although nonlinear neu- 
ral networks are not currently being 
used as widely as linear networks, 
they are applicable to a much 
broader range of problems than 
their linear counterparts. Further- 
more, the applications for which they 
are best suited often involve complex 
nonlinear relationships for which 
acceptable classical solutions are un- 
available. 

Successful commercial applications
of nonlinear multielement neural
networks in most cases currently
rely on the backpropagation algorithm,
with some use of backpropagation-through-time
[30], radial
basis functions [11], genetic
algorithms [3, 24], Kohonen's Learning
Vector Quantization (LVQ) [9],
and a number of other algorithms.
Whatever the paradigm, neural networks
are currently being used
throughout business and industry to
satisfy a diverse assortment of needs.
Most neural network applications
address problems described by one
of the following three categories:
1) pattern classification, 2) prediction
and financial analysis, and 3) control
and optimization. Examples from
each category follow:


Pattern Classification 

Credit card fraud detection. Several 
banks and credit card companies in- 
cluding American Express, Mellon 
Bank, First USA Bank, and others 
are currently using neural networks 
to study patterns of credit card usage 
and to detect transactions that are 
potentially fraudulent [8, 10, 26]. 
Credit card fraud is a growing prob- 
lem that threatens the entire indus- 
try. Some institutions are using 
home-grown software, while others 
are using commercial products de- 
veloped by Nestor, HNC, and other 
companies. 

Machine-printed character recog- 
nition. Commercial products per- 
forming machine-printed character 
recognition have been introduced by 
a large number of companies and 
have been described in the literature. 
Among these products are those 
made by Sharp Corp. [9, 26], Mitsu- 
bishi Electric Corp. [9], VeriFone 
Inc. [8, 9, 11, 26], Hecht-Nielsen 
Corp. (HNC) [11], Nestor Inc. [33], 
Calera Recognition Systems Inc. [11], 
Caere Corp. [11], and Audre Recog- 
nition Systems [11]. Sharp’s Optical 
Character Recognition (OCR) system 
is used to recognize Japanese charac- 
ters. It contains approximately 10 
million weights and uses a variant of 
Kohonen’s LVQ algorithm. It out- 
performs existing conventional sys- 
tems in speed and accuracy. Mitsubi- 
shi is currently developing a similar 
system [9]. VeriFone’s Onyx Check 
Reader provides an accurate, low- 
cost system for reading identification 


numbers on checks by using a custom | 


analog neural net chip made by Syn- 


manage its trade in cattle futures. ..Spiegel is using 
are to determine which customers get their catalogs. 


aptics. Calera Recognition Systems 
markets a product, FaxGrabber, 
which automatically converts incom- 
ing faxes to text using a modified 
radial basis function neural network 
to perform OCR. Highlighting the 
secrecy with which many firms guard 
their reliance on neural network 
technology, Calera did not acknowl- 
edge their use of the technology 
(which began in 1986) until 1992 
when competitor Caere Corp. an- 
nounced the use of neural nets in 
Caere’s highly successful AnyFax 
OCR engine. AnyFax is used in 
Caere’s FaxMaster software and is 
licensed for use in other products 
including Delrina Technology Inc.’s 
WinFax Pro 3.0 fax software. Audre 
Recognition Systems uses a variant of 
the backpropagation algorithm in its 
OCR product, the Audre Neural 
Network, which not only reads stan- 


` dard alphanumerics but can also be 


trained to recognize specialized sym- 
bols on engineering drawings [11]. 

Hand-printed character recogni- 
tion. HNC’s Quickstrokes Auto- 
mated Data Entry System is being 
used to recognize handwritten forms 
at Avon’s order-processing center 
and at the state of Wyoming’s De- 
partment of Revenue. In the June 
1992 issue of Systems Integration Busi- 
ness, Dennis Livingston reports that 
before implementing the system, 
Wyoming was losing an estimated 
$300,000 per year in interest income 
because so many checks were being 
deposited late. Cardiff Software of- 
fers a product called Teleform which 
uses Nestor’s hand-printed character 
recognition system to convert a fax 
machine into an OCR scanner. Poqet
Computer, now a subsidiary of Fujitsu,
uses Nestor's NestorWriter
neural network software to perform
handwriting recognition for the pen-based
PC it announced in January
1992 [25].

Cursive handwriting recognition. 
Neural networks have proved useful 
in the development of algorithms for
on-line cursive handwriting recognition
[20]. A recent startup company
in Palo Alto, Lexicus, beginning with
this basic technology, has developed
an impressive PC-based cursive
handwriting system.

Quality control in manufactur- 
ing. Neural networks are being used 
in a large number of quality control 
and quality assurance programs 
throughout industry. Applications 
include contaminant-level detection 
from spectroscopy data at chemical 
plants [11, 14] and loudspeaker de- 
fect classification by CTS Electronics 
[1]. According to Justin Kestelyn in
the June 1990 issue of AI Expert, neural
networks are also being used by
the Florida Department of Citrus to 
perform orange juice purity evalua- 
tion. Applied Intelligent Systems of 
Ann Arbor, Mich., has built into its 
vision computers neural recognition 
features that are used for quality 
control in factories [11]. 

Event detection in particle accel- 
erators. Research into the feasibility 
of using neural networks to detect 
notable events in high-energy parti- 
cle colliders has been performed at 
the European Center for Particle 
Physics (CERN), and at a number of 
other research organizations [5]. 
Steven Kasow of CERN has reported 
that scientists there are using fast 
analog neural networks in real-time 
triggering systems for detectors. This
permits the distillation of an enormous
number of candidate events
into a manageable set of “interesting”
events which can be recorded on
mass-storage devices and studied
further. Neural networks are proving
especially useful and cost-effective
when used in experiments for
which complex criteria are needed to
differentiate between interesting and
uninteresting events. Similar work is
taking place at the Fermi National
Accelerator Laboratory, Batavia, Ill.,
using Intel's high-speed analog
ETANN neural network chip, according
to the June 1993 issue of the
Cognizer Report newsletter.
Petroleum exploration. Oil com- 
panies including Arco and Texaco 
are using neural networks to help 
determine the locations of under- 
ground oil and gas deposits [25]. 
War on drugs. Yes, neural networks
have even made their debut in
the U.S. government's famous war
on drugs. PC-based software emulat- 
ing a multilayer neural network is 
being used on a daily basis at the 
North Carolina State Bureau of In- 
vestigation (NCSBI) to help forensic 
experts identify cocaine samples 
originating from the same batch. J. F. 
Casale and J. W. Watterson report in 
the March 1993 issue of the Journal of 
Forensic Sciences that the information 
helps undercover agents put to- 
gether drug-related criminal cases. 

Medical applications. Commer- 
cial products by Neuromedical Sys- 
tems, Inc. are used for cancer screen- 
ing and other medical applications 
[8, 9, 11, 19, 26]. The company mar- 
kets electrocardiograph and pap 
smear systems that rely on neural 
network technology. The pap smear
system, Papnet, is able to help cytotechnologists
spot cancerous cells,
drastically reducing false-negative
classifications. The system is used by
the U.S. Food and Drug Administra- 
tion [6]. 


Prediction and Financial Analysis 
Financial forecasting and portfolio
management. Neural networks are
used for financial forecasting at a
large number of investment firms 
and financial entities including Mer- 
rill Lynch & Co., Salomon Brothers, 
Shearson Lehman Brothers Inc., Cit- 
ibank, and the World Bank [3, 9, 24, 
25]. Gerber Baby Foods reportedly 
uses neural networks to help manage 
its trade in cattle futures [6]. Using 
neural networks trained by genetic 
algorithms, Citibank’s Andrew Colin
claims to be able to earn 25% returns 
per year investing in the currency 
markets. A startup company, Prom- 
ised Land Technologies, offers a 
$249 software package that is 
claimed to yield impressive annual 
returns [24]. 

Loan approval. Chase Manhattan 
Bank reportedly uses a hybrid system 
utilizing pattern analysis and neural 
networks to evaluate corporate loan 
risk. Robert Marose reports in the 
May 1990 issue of AI Expert that the 
system, Creditview, helps loan offi- 
cers estimate the credit worthiness of 
corporate loan candidates. 

Real estate analysis. HNC's Areas 
Automated Property Valuation System
[8] is being used by Foster Ousley
Conley to evaluate the value of
residential property in California.

Marketing analysis. The Target 
Marketing System developed by 
Churchill Systems is currently in use 
by Veratex Corp. to optimize marketing
strategy and cut marketing
costs by removing unlikely future
customers from a list of potential 
customers [8]. Likewise, Spiegel Inc. 
is using software created by Neural- 
Ware Inc. to determine which cus- 
tomers should receive their mail 
order catalogs. Spiegel’s director of 
market research expects savings of at 
least $1 million per year based on 
increased sales and reduced catalog 
mailings [25]. 

Airline seating allocation. The 
Airline Marketing Assistant/Tacti- 
cian developed by BehavHeuristics 
Inc. uses neural networks to predict 
passenger demand and allocate seat- 
ing for carriers including Nationair 
Canada and USAir [8]. 


Control and Optimization 

Electric arc furnace electrode posi- 
tion control. Electric arc furnaces are 
used to melt scrap steel. The Intelli- 
gent Arc Furnace controller systems 
installed by Neural Applications 
Corp. [8, 28] are reportedly saving 
millions of dollars per year per fur- 
nace in increased furnace through- 
put and reduced electrode wear and 
electricity consumption. The control- 
ler is currently being installed at fur- 
naces worldwide. 

Semiconductor process control. 
Kopin Corp. has used neural net- 
works to cut dopant concentration 
and deposition thickness errors in 
solar cell manufacturing by more 
than a factor of two [9]. 

Chemical process control. Pavil-
ion Technologies has developed a
neural network process control pack- 
age, Process Insights, which is help- 
ing Eastman Kodak and a number of 
other companies reduce waste, im- 
prove product quality, and increase 
plant throughput [4, 8, 9, 11, 12]. 
Neural network models are being 
used to perform sensitivity studies, 
determine process set points, detect 
faults, and predict process perfor-
mance.

Petroleum refinery process con- 
trol. Texaco's Puget Sound Refinery, 
which processes 120,000 barrels of
oil a day, has integrated neural net-
works into the plant’s process control
systems. As described in the June 
1990 issue of AI Expert, one of these 
networks is used in the control of a 
debutanizer, a system which sepa- 
rates hydrocarbons according to 
their molecular weights. This re- 
quires precise monitoring of temper- 
atures, pressures, and flow rates. 
The 17-hour batch cycle subjects the 
process to constant instability. A neu- 
ral network has been built and 
trained to help ensure product qual-
ity during periods of change and in-
stability. The seven-input, two-
output network, which was trained
with roughly 1,500 data samples, is
usually able to correct errors in the
control parameters before they ap- 
pear. A feedback mechanism helps 
reduce unexpected errors that do 
occur. 

Continuous-casting control dur-
ing steel production. A neural con-
trol system is in operation in Japan at
plants owned by Fujitsu Ltd. and 
Nippon Steel Corp. The system has 
reduced costs by several million dol- 
lars a year by eliminating the damage 
and downtime caused by “breakout,” 
when imperfect control allows spill- 
age of molten steel [9, 26, 30]. The 
system uses a feedforward network 
trained by backpropagation to detect 
breakout before it occurs, allowing 
corrective measures to be taken. The 
control system has been operating
since early 1990. 

Food and chemical formulation 
optimization. Neural networks are 
used to optimize formulations at the 
Glidden Co., the Lord Corp. [7], and 
at M&M/Mars. Researchers at the 
first two companies report success 
using AI Ware’s CAD/Chem package 
to search for improved chemical for- 
mulations. CAD/Chem has been used 
by Lord Corp. in the process of 
formulating a new adhesive product 
[7] by an iterative search technique. 


Nonlinear Applications on the
Horizon


A large number of research pro-
grams are developing neural net-
work solutions that are either likely
to be used in products in the near
future or, particularly in the case of
military applications, that may al-
ready be incorporated into products,
albeit unadvertised. This category is
much larger than the foregoing, so
we present here only a few represen-
tative examples.

Missile guidance and detonation. 
David Andes at the U.S. Naval Air 
Warfare Center, China Lake, Calif., 
has worked for several years using 
analog neural networks and the 
MRIII algorithm [2] in missile guid- 
ance and other military applications 
[26]. He has found that when fast 
decisions are required, neural net- 
works have enormous advantages 
over conventional methods. 

Fighter flight and battle pattern 
guidance. Defense contractors have 
apparently developed software using 
neural networks to integrate multi- 
source data for flight and battle pat- 
tern guidance of Lockheed's YF-22 
Advanced Tactical Fighter based on 
real-time predictions of the immi- 
nent actions of an enemy aircraft. It 
is unclear, however, if such a system 
is operational [24]. 

Optical telescope focusing. Neu- 
ral networks can be used to compen- 
sate for atmospheric disturbances by 
adaptively deforming mirror ele- 
ments in response to atmospheric 
activity that can blur images. In stra- 
tegic defense initiative-related work, 
Lockheed Missiles and Space Co. has 
developed a proprietary neural mi- 
crochip that drives an adaptive fo- 
cusing system for laser/mirror sys- 
tems. This allows relatively small 
telescopes to rival much larger and 
more expensive ones. Colin Johnson 
reports in the November 19, 1990 
issue of the Electronic Engineering 
Times that the first generation of the 
system had 69 piezoelectric actuators 
mounted on the back of the mirror to 
adjust it to the desired shape. Experi- 
ments with a similar idea utilizing a 
multiple mirror telescope are also 
described in the literature [22].

Vehicular trajectory control. 
Neural networks can be used to solve 
highly nonlinear control problems. A 
two-layer neural network containing 
26 adaptive neural elements has 
learned to back up a computer- 
simulated trailer truck, even when 
initially “jackknifed.” The neural net
was able to learn of its own accord to
do this, regardless of initial condi-
tions. Experience gained with the
truck backer-upper should be appli-
cable to a wide variety of nonlinear
control problems [15].

Automotive applications. Ford 
Motor Co., General Motors, and 
other automobile manufacturers are 
currently researching the possibility 
of widespread use of neural net-
works in automobiles and in automo-
bile production. Some of the areas
that are yielding promising results in 
the laboratory include engine fault 
detection and diagnosis, antilock 
brake control, active-suspension con- 
trol, and idle-speed control. General 
Motors is having preliminary success 
using neural networks to model sub- 
jective customer ratings of automo- 
biles based on their dynamic charac- 
teristics to help engineers tailor 
vehicles to the market. 

Electric motor failure prediction.
Siemens has reportedly developed a 
neural network system that can accu- 
rately and inexpensively predict fail- 
ure of large induction motors [26]. 
The system achieves 80% to 90% 
overall failure prediction accuracy in 
comparison to 30% achieved by the 
best conventional techniques. The 
predictor will be integrated into Sie- 
mens's existing Advanced Motor 
Master System (SAMMS) controller. 

Speech recognition. The Stanford 
Research Institute (SRI) is currently 
involved in research combining neu- 
ral networks with hidden Markov 
models (HMM) and other technolo- 
gies in a highly successful speaker- 
independent speech recognition sys- 
tem. The technology will most likely 
be licensed to interested companies 
once perfected. 

Mass spectra classification. Bo 
Curry of Hewlett-Packard Labs col- 
laborated with David Rumelhart on 
the design of a feedforward neural 
network to classify low-resolution 
mass spectra of unknown com- 
pounds according to the presence or 
absence of 100 organic substruc- 
tures. Described in HPL Technical 
Report 90-161, 1990, the neural
network MSnet was trained to com- 
pute a maximum-likelihood estimate 
of the probability that each substruc- 
ture is present. MSnet classifies mass 
spectra more reliably than other 
methods reported in the literature, is 
much faster than the standard 
nearest-neighbor techniques, and 
because of the probabilistic interpre-
tation of the classification output, can
readily be combined with other in- 
formation sources. 

Biomedical applications. Neural 
networks are rapidly finding diverse 
applications in the biomedical sci- 
ences. They are being used widely in 
research on amino acid sequencing 
in proteins, nucleotide sequencing in 
RNA and DNA, ECG and EEG wave- 
form classification, prediction of pa- 
tients’ reactions to drug treatments, 
prevention of anesthesia-related ac- 
cidents, arrhythmia recognition for 
implantable defibrillators, patient 
mortality predictions, quantitative 
cytology, detection of breast cancer 
from mammograms, modeling schiz- 
ophrenia, clinical diagnosis of lower- 
back pain, enhancement and classifi- 
cation of medical images, lung nod- 
ule detection, diagnosis of hepatic 
masses, prediction of pulmonary 
embolism likelihood from ventila-
tion-perfusion lung scans, and the
study of interstitial lung disease. 

Drug development. One particu- 
larly promising area of medical re- 
search involves the use of neural net- 
works in predicting the medicinal 
properties of substances without 
expensive, time-consuming, and 
often inhumane animal testing [29]. 
For cancer drug screening, this has 
been accomplished by testing the ef- 
fects that a group of 134 known 
drugs have on the growth of cultures 
of 60 types of human tumor cells. 
These profiles were then applied to a 
feedforward neural network simu-
lated using NeuralWare's Profes-
sional II/PLUS software package and
trained by backpropagation to clas- 
sify each drug by mechanism of ac- 
tion. Cross-validation studies showed 
this method to be surprisingly accu- 
rate. The profiles of prospective 
drugs with unstudied medicinal 
properties could then be applied and 
classified by the network. More ex- 
tensive tests would be performed 
only on the small proportion of pro-
spective drugs placed by the network
in classes thought to be useful or in-
teresting.

Control of copiers. The Ricoh 
Corp. has successfully employed 
neural learning techniques for con- 
trol of several voltages in copiers in 
order to preserve uniform copy qual- 
ity despite changes in temperature, 
humidity, time since last copy, time 
since change in toner cartridge, and 
other variables. These variables in- 
fluence copy quality in highly nonlin- 
ear ways, which were learned 
through training of a backpropaga- 
tion network. In order to improve 
generalization and reduce the size of 
the networks in copiers, Ricoh em- 
ployed a sophisticated network- 
pruning method, which they call 
Optimal Brain Surgeon, which in- 
deed led to smaller and more accu- 
rate networks. 


More Detailed Descriptions of 
Selected Applications 

The following subsections describe in 
greater depth a group of applications 
selected from the preceding sum- 
mary. They all use some form of the 
delta rule or the backpropagation 
algorithm for adaptation and learn- 
ing. The fields of application are 
highly diverse, but the learning pro- 
cesses are remarkably similar. 

The telecommunications indus- 
try. Many neural network applica- 
tions are under development in the 
telecommunications industry for 
solving problems ranging from con- 
trol of a nationwide switching net- 
work to management of an entire 
telephone company. Other applica- 
tions at the telephone circuit level 
turn out to be the most significant 
commercial applications of neural 
networks in the world today. Mo- 
dems, commonly used for computer- 
to-computer communications and in 
every fax machine, have adaptive cir- 
cuits for telephone line equalization 
and for echo cancellation. Adaptivity 
is needed because each telephone 
line has its own individual character-
istics, and these characteristics
change over time.

Echo on telephone lines, which 
would normally be tolerated with 
speech, is devastating to high-speed 
data transmission. Echo cancelling 
solves the problem by detecting the 
echo and adding an equal and oppo-
site signal to the return path. The
cancelling signal is generated by an
adaptive transversal filter whose co-
efficients (weights) are automatically
adjusted by the LMS algorithm of
Widrow and Hoff [32], also known as
the delta rule in the field of neural 
networks. The adaptive filter makes 
use of what amounts to a single neu- 
ron. The first echo cancellers were 
developed at AT&T Bell Labs in the 
1960s by M. M. Sondhi and his col- 
leagues. Today they are everywhere. 
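
The echo-cancelling arrangement described above can be sketched in a few lines. This is only an illustration under stated assumptions: the echo path impulse response, the white-noise test signal, the number of taps, and the step size mu are all invented for the example, and the function names are hypothetical. The weight update itself is the Widrow-Hoff LMS rule named in the text.

```python
import random


def lms_echo_canceller(far_end, line_signal, num_taps=8, mu=0.01):
    """Adapt a transversal filter so its output tracks the echo in line_signal.

    far_end     -- samples sent toward the far end (the filter input)
    line_signal -- samples on the return path (here, echo only)
    Returns (residual, weights); residual is the return path after the
    estimated echo has been subtracted.
    """
    weights = [0.0] * num_taps
    taps = [0.0] * num_taps          # tapped delay line, newest sample first
    residual = []
    for x, d in zip(far_end, line_signal):
        taps = [x] + taps[:-1]
        echo_estimate = sum(w * t for w, t in zip(weights, taps))
        error = d - echo_estimate    # what is left after cancellation
        # LMS (delta rule) update: w <- w + 2 * mu * error * input
        weights = [w + 2.0 * mu * error * t for w, t in zip(weights, taps)]
        residual.append(error)
    return residual, weights


def simulate(num_samples=5000, seed=0):
    """Hypothetical test case: white noise through a made-up echo path."""
    rng = random.Random(seed)
    echo_path = [0.0, 0.5, -0.3, 0.1]            # assumed impulse response
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    line_signal = [
        sum(h * far_end[k - i] for i, h in enumerate(echo_path) if k - i >= 0)
        for k in range(num_samples)
    ]
    return far_end, line_signal, echo_path
```

Calling `lms_echo_canceller(*simulate()[:2])` adapts the weights toward the assumed echo path, so the residual shrinks as adaptation proceeds. A real canceller must also cope with near-end speech and line noise, which this sketch leaves out.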

The first application of adaptive 
techniques in telecommunications 
was telephone line equalization by 
Robert W. Lucky at AT&T Bell Labs. 
Telephone channels, radio channels, 
and even fiber-optic channels can 
have nonflat frequency responses
and nonlinear phase responses in the 
signal passband. Sending digital data 
at high speed through these channels 
often results in a phenomenon called 
“intersymbol interference,” caused
by signal pulse smearing in the dis- 
persive medium. Equalization in data 
modems combats this phenomenon 
by filtering incoming signals. A mo- 
dem’s adaptive filter, by adapting it- 
self to become a channel inverse, can 
compensate for the irregularities in 
channel magnitude and phase re- 
sponse. 

The adaptive equalizer in Figure 1 
consists of a tapped delay line (a 
transversal filter) with a single adap- 
tive neuron connected to the taps. 
Deconvolved signal pulses appear at 
the weighted sum, which is quantized 
to provide a binary output corre- 
sponding to the original binary data 
transmitted through the channel. 
The LMS algorithm is used to adapt 
the weights.

Figure 2a shows the analog re-
sponse of a telephone channel carry-
ing high-speed binary pulse data. 
Figure 2b shows an “eye” pattern, 
which is the same signal after going 
through a converged adaptive equal-
izer. Equalization opens the eye and
allows clear separation of +1 and -1
binary data pulses. 
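
A decision-directed equalizer of the kind shown in Figure 1 can be sketched as follows. The channel model (each pulse leaking 0.4 of its amplitude into the next symbol), the five taps, the pass-through initial weights, and the step size are assumptions chosen so that the eye is already partly open when adaptation starts; the function names are hypothetical.

```python
import random


def decision_directed_equalizer(received, num_taps=5, mu=0.01):
    """Equalize a +/-1 data stream with a single adaptive linear neuron.

    The quantized output (the decision) serves as the desired response,
    so no separate training sequence is used in this sketch.
    Returns (outputs, decisions, weights).
    """
    weights = [1.0] + [0.0] * (num_taps - 1)   # start as a pass-through
    taps = [0.0] * num_taps                    # tapped delay line
    outputs, decisions = [], []
    for r in received:
        taps = [r] + taps[:-1]
        y = sum(w * t for w, t in zip(weights, taps))   # weighted sum
        decision = 1.0 if y >= 0.0 else -1.0            # quantizer
        error = decision - y
        # LMS update driven by the decision error
        weights = [w + 2.0 * mu * error * t for w, t in zip(weights, taps)]
        outputs.append(y)
        decisions.append(decision)
    return outputs, decisions, weights


def simulate_channel(num_symbols=4000, seed=1):
    """Hypothetical dispersive channel: each pulse smears into the next."""
    rng = random.Random(seed)
    symbols = [rng.choice((-1.0, 1.0)) for _ in range(num_symbols)]
    received = [symbols[k] + (0.4 * symbols[k - 1] if k > 0 else 0.0)
                for k in range(num_symbols)]
    return symbols, received
```

In this toy setting the received samples come as close to zero as 0.6 in magnitude, while the equalized outputs settle near +1 and -1, which is the "opening of the eye" described above. With a more severe channel the initial decisions would be unreliable, and a known training sequence would normally be needed first.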

Active control of sound and vi-
bration. A new area of application
for adaptive and learning systems,
active control of noise and vibration,
has been developing during the last 5
or 10 years. Passive control of noise
would make use of thick walls and
sound-absorbing materials and coat-
ings, while passive control of vibra-
tion would make use of shock ab-
sorbers, damping materials and
structures, and other methods of iso-
lating and snubbing vibration. Active
sound control uses adaptive tech-
niques to generate antisound (equal
and opposite) to cancel noise in a
space or volume. Active vibration
control uses adaptive techniques to
generate vibration to cancel existing
vibration.

Figure 1. Adaptive channel
equalizer with decision-directed
learning

Active vibration control in a car is 
seen in the following example: En- 
gine vibration coupling into the chas- 
sis through the four supporting en- 
gine mounts is cancelled by 
transducers shunting the engine 
mounts, which are driven so that 
equal and opposite forces are applied 
to the chassis. The transducer signals 
come from a set of adaptive filters, 
each utilizing a single neuron 
adapted by means of the “filtered-X” 
LMS algorithm [32]. 
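
The filtered-X idea can be illustrated with a short simulation. In ordinary LMS the filter output reaches the error sensor directly; here it first passes through an actuator (secondary) path, so the reference signal is filtered by a model of that path before it is used in the weight update. Everything numeric below is an assumption made for the sketch: the sinusoidal reference, the two path impulse responses, the perfect path model, the tap count, and the step size.

```python
import math


def fir(coeffs, history):
    """Output of an FIR filter given its input history, newest sample first."""
    return sum(c * h for c, h in zip(coeffs, history))


def filtered_x_lms(num_samples=6000, num_taps=4, mu=0.02):
    """Cancel a tonal disturbance through a secondary (actuator) path.

    All paths below are made-up illustrative values, not measured data.
    Returns the residual at the error sensor, sample by sample.
    """
    primary = [0.0, 0.0, 0.8]         # disturbance path to the error sensor
    secondary = [0.0, 0.5]            # actuator-to-sensor path
    secondary_est = list(secondary)   # assume a perfect model of that path
    weights = [0.0] * num_taps
    x_hist = [0.0] * max(num_taps, len(primary), len(secondary_est))
    fx_hist = [0.0] * num_taps        # reference filtered by the path model
    y_hist = [0.0] * len(secondary)   # recent adaptive filter outputs
    residual = []
    for k in range(num_samples):
        x = math.sin(2.0 * math.pi * 0.125 * k)   # reference tone
        x_hist = [x] + x_hist[:-1]
        y = fir(weights, x_hist)                  # adaptive filter output
        y_hist = [y] + y_hist[:-1]
        disturbance = fir(primary, x_hist)
        error = disturbance + fir(secondary, y_hist)   # sensor signal
        fx_hist = [fir(secondary_est, x_hist)] + fx_hist[:-1]
        # Filtered-X LMS: descend the gradient using the filtered reference
        weights = [w - 2.0 * mu * error * f for w, f in zip(weights, fx_hist)]
        residual.append(error)
    return residual
```

The residual returned by `filtered_x_lms()` starts at the full disturbance level and decays as the weights adapt. In practice the secondary path model is itself estimated and imperfect, which limits the usable step size.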

Several companies have developed 
“electronic mufflers” which can re- 
place the conventional passive muf- 
flers in automobiles [23]. This is an 
example of active noise control. A 
tachometer on the engine generates 
pulses at the cylinder-firing rate. The 
tachometer signal is adaptively fil- 
tered, amplified, and fed to a small 
loudspeaker in the exhaust system. 
The loudspeaker generates anti- 
sound. The adaptive filter utilizes a 
single neuron that learns with the fil- 
tered-X LMS algorithm. The result is 
an engine that is at least as quiet as 
one with a conventional muffler. 
Additionally, the engine “breathes”
more easily, resulting in more horse-
power and better fuel efficiency. As 
described by Randy Barrett in the 
August 12, 1993, issue of Washington 
Technology, Noise Cancellation Tech- 
nologies (NCT) in a joint venture 
with Walker Manufacturing cur- 
rently has electronic mufflers under 
test in New York City and Montreal 
bus fleets, where they have already 
demonstrated a 2.5% improvement 
in fuel economy. According to the 
October 28, 1992, issue of the Elec- 
tronic Engineering Times, the first pro- 
duction vehicles with the NCT- 
Walker muffler should be available 
in 1996. A number of other automo- 
tive applications of the filtered-X 
LMS algorithm can be found in the 
proceedings of a conference on ac- 
tive control of sound and vibration 
held at Virginia Tech in April of 
1991.

Active noise cancellation is also 
being developed to reduce noise 
problems caused by heating and air- 
conditioning equipment, vacuum 
cleaners, emergency vehicle sirens, 
aircraft, lawn mowers, and industrial 
equipment. NCT now markets a $99 
noise-cancelling headphone called 
NoiseBuster. 

Beam control at the Stanford Lin- 
ear Accelerator Center. The Stan- 
ford Linear Accelerator Center 
(SLAC) is a complex of particle accel- 
erators operated by Stanford Uni- 
versity for the U.S. Department of 
Energy. Physicists from all over the 
world design and perform experi-
ments there, 24 hours a day, 7 days a
week. A 3-kilometer-long linear ac-
celerator fires both positrons and
electrons into the circular arcs of a
collider. A major challenge involves 
controlling the positions of the elec- 
tron and positron beams in the col- 
lider to within 2 microns in spite of 
unpredictable disturbances that take 
place in the accelerator (due to 
changes in temperature, barometric 
pressure, vibration, sensor noise and 
so forth). Collisions must occur in 
order for the physicists to do their 
work, and the probability of colli- 
sions depends on the accuracy of 
positioning the opposing positron 
and electron beams. 

The linear accelerator is divided 
into 20 sections. Each section has 
beam position sensors and control
magnets to deflect the beam. Con-
ventional feedback systems are used
with each section for beam control, 
and they greatly reduce the varia- 
tions in the beam position. Nonethe- 
less, the system could not achieve the 
required accuracy without adaptive 
noise cancelling. Each section was
equipped with a multi-input multi-
output (MIMO) adaptive canceller
with eight inputs and eight outputs. This
is equivalent to a neural network 
without nonlinearity. Adaptation was 
done by a MIMO form of the LMS 
algorithm. Prior to the installation of 
the new system, operators at the ac- 
celerator would frequently make 
frantic late-night phone calls for help 
in recovering from a problem. The 
system has been so robust and stable 
in the six months since the adaptive 
solution was implemented that the 
late-night phone calls have ceased, 
and no significant problems have 
occurred. (This work was performed 
by Thomas M. Himel of SLAC.) 

The truck backer-upper. Vehicu-
lar control by artificial neural net-
works is a topic that has generated 
widespread interest. At Purdue Uni- 
versity, tests have been performed 
using neural networks to control a 
model helicopter [16]. In a much 
larger project, a full-sized self-driv- 
ing van named ALVINN (Autono- 
mous Land Vehicle In a Neural Net- 
work) complete with video camera 
"eyes" and an onboard "brain" made 
from four workstations has been 
developed and built at Carnegie- 
Mellon University [18]. ALVINN 
learned to drive by watching humans 
drive and can drive long distances at
normal highway speeds, negotiating 
through traffic without human inter- 
vention. The system is not yet per- 
fect, of course, so when ALVINN 
drives, a human is always present to 
take over the controls if something 
goes wrong.

We now consider a system less 
complicated and more easily de- 
scribed than ALVINN—that of a 
neural network which has learned to 
steer a computer-simulated truck 
and trailer while backing to a loading 
platform. A solution to this highly 
nonlinear control problem was ob- 
tained by self-learning. The inputs to 
the two-layer network are “state” 
variables: the angle and position of
the rear of the trailer and the angle
of the cab (see Figure 3). The output
of the neural network is the angle of
the steering wheel. The work was
done by Nguyen and Widrow [15].
The learning algorithm they used,
which is based on the famous back-
propagation algorithm [21, 30, 31], is
called backpropagation-through-
time.

The truck was only allowed to back
up. Backing was done as a sequence
of small steps. On the scale of a real
“18-wheeler,” each step would be a
distance of approximately one meter.
The truck backs from its initial posi-
tion until it hits something and stops.
The desired final state of the system
involves having the rear of the trailer
parallel to the loading platform and
positioned at its center. The actual
final state is compared with the de-
sired final state, and the difference is
a state error vector. After each
backing-up sequence is completed,
the final error vector is used to mod-
ify the controller weights, so that if
the truck is placed in the same initial
position and allowed to retry the
backup sequence, the new final-state
error will have a smaller magnitude
than before.

Figure 2. Eye patterns produced
by overlaying cycles of the re-
ceived waveform: a. before adap-
tive equalization; b. after adap-
tive equalization.

Figure 3. Truck, trailer, and load-
ing dock

Figure 4. Plant and controller

Figure 5. Training the neural net
plant emulator

Figure 4 is a diagram of the neural
net controller steering the truck, a
controller governing a “plant” repre- 
sented by the truck kinematics. To 
train the controller, an emulator of 
the truck kinematics is needed. This 
is a two-layer neural network trained 
by backpropagation as shown in 
Figure 5 to produce the same output 
states as the plant when both the 
emulator and plant have the same 
driving function. 

The controller is a two-layer neu- 
ral network trained as shown in 
Figure 6. The initial position or state 
of the truck, z_0, is applied to the con-
troller, which generates a single out-
put, the steering wheel angle. Using 
this steering signal, the truck backs 
up a step. The process of using the 
controller to set the steering angle, 
and then backing a step is repeated 
until either the truck hits something 
or the number of time steps exceeds 
a predetermined constant. 

Backing from state to state is rep- 
resented by signals going through 
the layers of a neural net. The con- 
troller and emulator are each com- 
posed of two layers of adaptive neu- 
rons. Every backing step corresponds 
to signals going through four layers. 
By “unrolling” the control system’s 
feedback loop, the whole backup se- 
quence can thus be represented as 
the forward propagation through a 
giant feedforward neural network 
containing a number of layers equal 
to four times the number of time 
steps. In a process called backpropa- 
gation-through-time, the final-error 
vector is backpropagated through all 
the layers of this composite network. 

After each backup sequence, the 
backpropagation-through-time algo- 
rithm finds a gradient of the squared 
positional error of the truck’s final 
state with respect to the weights of 
the controller. This gradient is used 
to update the controller's weights by 
stochastic gradient descent. 
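
The mechanics of backpropagation-through-time can be shown on a deliberately tiny stand-in for the truck problem. The sketch below is not the Nguyen and Widrow system: it assumes a one-dimensional plant x[t+1] = x[t] + u[t] whose equations are known exactly (so no emulator is needed), a controller with a single weight w giving u[t] = w * x[t], and a desired final state of zero. The function names and all numeric settings are hypothetical. What it does share with the text is the structure: run the sequence forward, take the final-state error, and pass it backward through every unrolled step to accumulate the gradient for the controller weight.

```python
import random


def rollout(w, x0, num_steps):
    """Forward pass: apply the controller and plant for num_steps steps."""
    states = [x0]
    for _ in range(num_steps):
        x = states[-1]
        u = w * x                 # one-weight linear "controller"
        states.append(x + u)      # assumed plant: x[t+1] = x[t] + u[t]
    return states


def bptt_gradient(w, x0, num_steps, target=0.0):
    """Gradient of the squared final-state error with respect to w."""
    states = rollout(w, x0, num_steps)
    grad_state = 2.0 * (states[-1] - target)   # d(error^2)/d(final state)
    grad_w = 0.0
    for x in reversed(states[:-1]):            # walk the unrolled steps backward
        grad_w += grad_state * x               # this step's contribution via u = w*x
        grad_state *= (1.0 + w)                # pass the error back one time step
    return grad_w


def train_controller(num_steps=5, sequences=200, lr=0.02, seed=0):
    """One gradient step per sequence, each from a random initial state."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(sequences):
        x0 = rng.uniform(-1.0, 1.0)
        w -= lr * bptt_gradient(w, x0, num_steps)
    return w
```

After training, `rollout(train_controller(), 1.0, 5)[-1]` is closer to the target than the untrained result from the same initial state, mirroring the retry behavior described earlier. In the truck backer, each of these scalar steps is replaced by four layers of neurons, and the plant derivative comes from the trained emulator rather than from known equations.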

Once learning is complete, the 
truck is able to back up satisfactorily 
from almost any initial position, even 
"jackknifed," and even from initial 
positions that were not previously 
encountered during training. The 
controller's ability to react and re- 
spond reasonably to new positions is 
an example of generalization. An il- 
lustration of the functioning of an 
already-trained system is shown in
Figure 7. This is a laboratory exercise
that could, in the future, have impli- 
cations for vehicle control. Large 
American trucking companies are 
seriously exploring this technology. 
At the present time, the truck backer 
serves as a visual demonstration of 
the capabilities of nonlinear net- 
works. This demonstration helped 
motivate development of the Intelli-
gent Arc Furnace controller de-
scribed next.

Steel making. An electric arc fur- 
nace is used to melt and process 
scrap steel. The heat energy comes 
from a three-phase power line of 
rather massive capacity (often 30
megawatts or more, enough electri-
cal power for a city of 30,000 peo-
ple). The three-phase line connects
to a bank of step-down transformers 
to supply current for three elec- 
trodes that stick down into the fur- 
nace. The electrodes are made of 
graphite, are about one foot in diam- 
eter, and are about 20 feet long. 
Three independent servos control 
the depth of the electrodes into the 
furnace. 

When starting a new “heat,” scrap
steel is loaded into the furnace, and
the servos are activated to drive the
electrodes down toward the scrap 
pile. When an arc is first struck, 
sparks fly, and the noise is deafening. 
One's first impression of this is that it 
is like Dante’s inferno. 

Because the cost of installing and 
operating a large arc furnace is so
great, even small changes in effi- 
ciency have a tremendous impact on 
economics. The motivation for the 
development of “intelligent control” 
is clear. In this section we describe 
the Intelligent Arc Furnace control- 
ler, invented by Bill Staib of Neural 
Applications Corp. [28]. The figures 
in this section were supplied by the 
inventor. 

Figure 8 shows an arc furnace, its 
three-phase power system, and in- 
strumentation that provides signals 
useful for the control of the elec- 
trode servos. Currents and voltages 
in the system are sensed, digitized, 
and fed to a 486 PC that implements 
the neural control system. Numerical 
processing is performed by an 80- 
MFLOP Intel i860 microprocessor. A 
microphone placed near the furnace 
provides the computer with the
sounds of “Dante’s inferno.” From all
the sensed variables, a state vector is
obtained.

Figure 9a shows the training of a 
neural network emulator of the fur- 
nace. The idea is similar to that of 
Figure 5 for the truck backer. The 
emulator is used in the training of 
the controller or regulator, another 
neural network. Figure 9b shows the 
training of the regulator. The learn- 
ing algorithm is a variant of the back- 
propagation algorithm. It works in a 
similar way to the training process 
for a single stage of Figure 6 of the 
truck backer. 

The results with neural control 
thus far have been excellent com- 
pared with the control systems that 
commonly exist for arc furnaces. 
Consumption of electric power is
reduced by 5% to 8%; wear and tear
on the furnace and the electrodes is
reduced by about 20%; the power
factor on the input power lines is
brought closer to 1; and the daily
throughput of steel is increased by
10%. The neural controllers are
being installed by Neural Applica-
tions Corp. just as quickly as they can
be produced. These improvements
are reportedly worth millions of dol-
lars per year per furnace.

Figure 6. Training the controller
with backpropagation (C =
controller; E = emulator).

Figure 7. Example of a truck
backup sequence


Figure 8. Arc furnace data acqui-
sition system. Source: Courtesy
of Bill Staib

Figure 9. Block diagrams of a. fur-
nace emulator; b. furnace/regu-
lator. Source: Courtesy of Bill
Staib. Diagram legend: Reg (N) =
regulator outputs for time N;
S (N) = furnace state conditions
for time N.


The Chemical Process Industry

Pavilion Technologies, Inc. of Aus-
tin, Tex., has embedded neural net-
works and fuzzy logic into their Pro-
cess Insights package for chemical
manufacturing and control applica-
tions [4]. In this package, the user
takes historical process data and uses
it to build a predictive model of plant
behavior. The model is then used to
change the control setpoints in the
plant to optimize behavior. Pavilion
Technologies is a spin-off of MCC,
where the original work was done in
1989 to 1990 by John Havener of
Texas Eastman and Jim Keeler of
MCC/Pavilion Technologies. In the
original application conducted at the
Texas Eastman Facility, Longview,
Tex., neural networks in the Process
Insights package produced setpoint
changes that reduced by one-third
the requirement of an expensive
chemical additive needed to remove
byproduct impurities during pro-
duction. The facility produces plas-
tics and chemical intermediates such
as aldehydes and olefins. Since that
work was completed, the technology
and Pavilion's Process Insights soft-
ware has been used in nearly 200
real-world applications, including
modeling and optimization of distil-
lation columns, [...] con-
trol of plastics production, modeling
and control of impurity levels in boil-
ers. These applications have gener-
ated tremendous paybacks, with sav- 
ings of some applications totalling 
millions of dollars per year in single- 
unit production facilities. Texas East- 
man, a division of Eastman Kodak, 
has been so satisfied with the results 
achieved by neural networks in the 
Process Insights package that they 
are currently encouraging the use of 
neural networks throughout their 
Longview plant. The success of the 
program is described in the April 29,
1993 issue of the company newslet-
ter Texas Eastman News.

In making these applications, the 
first step is plant modeling or plant 
emulation. Typically, the plant has 
many inputs (such as pressures, tem- 
peratures, flow rates, and feed-stock 
characteristics) and one or more out-
put parameters (such as yield, impu-
rity levels, variance). In Figure 10 an
adaptive neural network is used to
model an unknown plant (i.e., to
learn the plant's dynamics from his- 
torical data). 

Once the plant emulator con- 
verges, it can be used to train the 
neural net controller. Figure 11 
shows how this is done. The error 
vector is the difference between the 
plant output vector and the desired- 
state vector. This error is backpropa- 
gated through the neural plant 
model to provide error signals for 
the adaptation of the weights of the 
controller. The controller weights 
are adapted by the backpropagation 
algorithm to minimize the sum of 
squares of the components of the 
error vector. Pavilion uses fuzzy logic 
in its Process Insights package to es- 
tablish constraints on some of the 
controlled variables. 
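
The two-stage procedure of Figures 10 and 11 (first fit an emulator to plant data, then backpropagate the output error through the frozen emulator to adapt the controller) can be sketched with scalar linear models. This is an illustration under stated assumptions, not the Process Insights implementation: the plant is a made-up static function with no dynamics, the emulator and controller each have just a gain and a bias, and the names and step sizes are hypothetical. Fuzzy-logic constraints are not modeled.

```python
import random


def plant(u):
    """Stand-in for the real process; the learner only sees its outputs."""
    return 2.0 * u + 1.0


def train_emulator(samples=2000, mu=0.05, seed=0):
    """Fit y_hat = a*u + b to observed (u, y) pairs with the LMS rule."""
    rng = random.Random(seed)
    a, b = 0.0, 0.0
    for _ in range(samples):
        u = rng.uniform(-1.0, 1.0)           # recorded plant input
        err = plant(u) - (a * u + b)         # plant output minus emulator output
        a += 2.0 * mu * err * u
        b += 2.0 * mu * err
    return a, b


def train_controller(a, b, samples=5000, mu=0.02, seed=1):
    """Adapt u = gain*d + bias so the emulated output follows the desired d."""
    rng = random.Random(seed)
    gain, bias = 0.0, 0.0
    for _ in range(samples):
        d = rng.uniform(-1.0, 1.0)           # desired plant output
        u = gain * d + bias                  # controller output
        err = d - (a * u + b)                # error at the emulator output
        err_at_u = err * a                   # backpropagated through the emulator
        gain += 2.0 * mu * err_at_u * d
        bias += 2.0 * mu * err_at_u
    return gain, bias
```

After `a, b = train_emulator()` and `gain, bias = train_controller(a, b)`, feeding `gain * d + bias` into `plant` returns a value close to the desired output d. The controller never differentiates the plant itself; it relies entirely on the emulator's weights to translate output error into a correction at its own output.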

In most practical cases, it is not
possible to use a controller as simple
as that shown in Figure 11. This is
because almost all physical plants
have internal dynamics. The plant's
response to a control signal depends
on both the current input to the
plant and the current state of the
plant. Any actions by the controller
must therefore consider the state of
the plant as well as its current input.
A common solution involves incor-
porating tapped delay lines at the
emulator and controller inputs to
allow both networks to form internal
representations of the present state.
With tapped delay lines incorpo-
rated, Figure 11 then describes an
increasingly popular form of open-
loop control called nonlinear adap-
tive inverse control. Another ap-
proach is to incorporate one or more
feedback loops in the system to cre-
ate a dynamic system like the truck
backer. The controller can then
be trained by backpropagation-
through-time. Rather than simply
using backpropagation to train the
emulator as done with the truck
backer, some closed-loop systems use
backpropagation-through-time for
this purpose as well [30].

Figure 10. Adaptive plant emula-
tion. Source: Courtesy of Jim
Keeler

Figure 11. Using the plant model
or emulator for backpropagation
of error for training the neural
controller. Source: Courtesy of
Jim Keeler

In Process Insights, the relation- 
ship between the control history and 
the plant’s state variables is deter- 
mined by using measured states to 
train a dynamic state estimator [13]. 
The state estimator is then added to 
Figure 11 between the controller and 
the memoryless emulator. Memory is 
also added to the controller, which is 
trained by backpropagating error 
signals through the emulator and 
state estimator. 

It is interesting to compare Fig- 
ures 6, 9, and 11. Very similar things 
are going on in the vehicle control 
system (the truck backer), in the arc 
furnace control system, and in the 
chemical process control system. An 
emulator is made of the process to be 
controlled, and the controller is 
adapted by backpropagating the sys-
tem error through the emulator.


This is a very powerful idea, and it 
leads to useful applications. The 
reader should be aware, however, that
this is not the only means of neural 
control. Other approaches include 
radial basis functions, reinforcement 
learning, and CMAC for problems 
such as process control, robotic actu- 
ator control, and vehicular control 
[84]. 


Conclusion 

Neural network architectures will 
probably never be able to compete 
with conventional techniques at per- 
forming precise and well-defined 
numerical operations such as matrix 
inversions or Fourier transforms. 
However, there are large classes of 
problems that appear to be more 
amenable to solution by neural net- 
works than by other available tech- 
niques. These tasks often involve 
ambiguity, such as that inherent in 
handwritten character recognition. 
Problems of this sort are difficult to 
tackle with conventional methods 
such as matched filtering or nearest- 
neighbor classification, in part be- 
cause the metrics used by the brain to 
compare patterns may not be very
closely related to those chosen by an
engineer designing a recognition sys- 
tem. Likewise, because reliable rules 
for recognizing a pattern are usually 
not at hand, fuzzy logic and expert 
system (ES) designers also face the 
difficult and sometimes impossible 
task of finding acceptable descrip- 
tions of the complex relations gov- 
erning class inclusion. In trainable 
neural network systems, these rela- 
tions are abstracted directly from 
training data. Moreover, because 
neural networks can be constructed 
with numbers of inputs and outputs 
ranging into the thousands, they can 
be used to attack problems that re- 
quire consideration of more input 
variables than could be feasibly uti- 
lized by most other approaches. It 
should be noted, however, that neu-
ral networks will not work well at
solving problems for which suffi- 
ciently large and general sets of 
training data are not obtainable. 

Other tasks, such as those per- 
formed by VeriFone’s Onyx Check 
Reader and by event detectors in 
particle colliders, can be solved suc- 
cessfully using more conventional 
approaches, but neural networks 
help provide solutions which result
in less hardware, faster response
times, lower costs, and quicker de-
sign cycles. Several applications are
now taking advantage of the high 
speeds and low costs of various neu- 
ral network chips. 

Perhaps the most important ad- 
vantage of neural networks is their 
adaptivity. Neural networks can au-
tomatically adjust their parameters
(weights) to optimize their behavior 
as pattern recognizers, decision mak- 
ers, system controllers, predictors, 
and so forth. Self-optimization allows 
the neural network to "design" itself. 
The system designer first defines the 
neural network architecture, deter- 
mines how the network connects to 
other parts of the system, and 
chooses a training methodology for 
the network. The neural network 
then adapts to the application. Adap-
tivity allows the neural network to
perform well even when the environ- 
ment or the system being controlled 
varies over time. There are many 
control problems that can benefit 
from continual nonlinear modeling 
and adaptation. Neural networks, 
such as those used by Pavilion in
chemical process control, and by 
Neural Applications Corp. in arc fur- 
nace control, are ideally suited to 
track problem solutions in changing 
environments. Additionally, with 
some “programmability,” such as the 
choices regarding the number of 
neurons per layer and number of 
layers, a practitioner can use the 
same neural network in a wide vari- 
ety of applications. Engineering time 
is thus saved. 

Another example of the advan- 
tages of self-optimization is in the 
field of ES. In some cases, instead of 
obtaining a set of rules through in- 
teraction between an experienced 
expert and a knowledge engineer, a 
neural system can be trained with 
examples of expert behavior. The 
neural net becomes, in a sense, a 
trainable ES. Although it would im- 
plement rules, the actual rules imple- 
mented would not be apparent. The 
system designer would not be dealing 
with rules explicitly. On the other 
hand, if precise and complete rules 
are available or obtainable, then one 
would do best to use a classical ES. 

This article has described only a 
small fraction of the commercial, 
industrial, and scientific applications 
of neural networks that exist today. 
The list is long and impressive and 
growing rapidly. There is no way to 
predict how widespread use of the 
technology will eventually become. 
However, based on the current ex- 
tent of the field, and the rapidity of 
its growth, it seems reasonable to 
expect that before the turn of the 
century, neural networks will be a
household word and a part of every-
day life.
Models of the brain and evolution


Genetic and Evolutionary 
Algorithms Come of Age 


Even before there were computers, there was thinking about the mind as a computer, as a machine.


DAVID E. GOLDBERG 




And in this way, computer science and engineering trace their roots to using natural exam- 
ples. Within these fields of endeavor, AI drew its initial inspiration from nature, and work on 
computer-simulated brains received the lion's share of the early attention. But even back then, 
nature's other metaphor of adaptation planted a different seed that is now blossoming around
the globe. Specifically, Darwinian evolution has spawned a family of computational methods 
called genetic algorithms (GAs) or evolutionary algorithms (EAs). 


These search procedures, based on 
the mechanics of natural selection and 
genetics, are finding increasing applica- 
tion to difficult search, optimization, 
and machine-learning problems across 
a wide spectrum of human endeavor. 
Although for some years these investi- 
gations have remained cloistered in 
universities and research institutes, a
new class of real-world applications is 
graduating from college and is moving 
into the computer rooms of industry 
and government, with repercussions 
starting to be felt from the factory floor 
to the community at large. 


What Are Genetic Algorithms?

GAs¹ are search procedures based on
natural selection and genetics. There
are many variations on these algo-
rithms. For concrete discussion, we
limit ourselves to the simple GA pre-
sented elsewhere [7], a GA that pro-
cesses a finite population of fixed-
length binary strings. In practice, bit
codes, k-ary codes, real (floating-
point) codes, permutation (order)
codes, Lisp codes, and others have all
been used with success. Each of these
has its place, but here we examine
a simple GA to better understand
basic mechanics and principles.

¹ In the remainder we will use the
term GA to mean either evolutionary or genetic
algorithms. Historically, the term evolutionary
has been associated with algorithms that use se-
lection and mutation alone, while the term ge-
netic has been associated with algorithms that
use selection, mutation, recombination, and a
variety of other nature-inspired mechanisms.

A simple GA consists of three op- 
erators: selection, crossover, and 
mutation. 

Selection is the survival of the fit-
test within the GA. There are many
ways to achieve effective selection,
including ranking, tournament, and
proportionate schemes, but the key
notion is to give preference to better indi-
viduals. For example, in two-party
tournament selection, pairs of strings
are drawn randomly from the paren-
tal population, and the better individ-
ual places an identical copy in the
mating pool. If a whole population is
selected in this manner, each individ-
ual will participate in two tourna-
ments and the best individual in the
population will win both trials. The
median individual will typically win
one trial, and the worst individual
will not win at all.
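
Two-party tournament selection as described above can be sketched as follows. This is an illustrative sketch, not the code of reference [7]: it assumes an even population size, pairs individuals without replacement in each of two passes (so that each individual competes exactly twice), and uses a hypothetical bit-counting objective.

```python
import random


def tournament_selection(population, fitness, rng=random):
    """Fill a mating pool of len(population) by pairwise tournaments.

    Two passes are made; in each pass the population is shuffled and
    split into pairs, so every individual competes exactly twice.
    """
    if len(population) % 2 != 0:
        raise ValueError("this sketch assumes an even population size")
    mating_pool = []
    for _ in range(2):
        contestants = list(population)
        rng.shuffle(contestants)
        for i in range(0, len(contestants), 2):
            a, b = contestants[i], contestants[i + 1]
            # The better individual places a copy in the mating pool.
            mating_pool.append(a if fitness(a) >= fitness(b) else b)
    return mating_pool


def count_ones(bits):
    """Hypothetical objective: the number of 1 bits in the string."""
    return bits.count("1")


population = ["0000", "0001", "0011", "0111", "1111", "0101"]
pool = tournament_selection(population, count_ones, random.Random(0))
```

Whatever the random pairing, the unique best string contributes two copies to the pool and the unique worst string contributes none.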

Of course, for selection to function 
there must be some way of determin- 
ing what is good. This evaluation can 
come from a formal objective func-
tion, or it can come from the subjec-
tive judgment of a human observer or 
critic. As the tournament selection 
example makes clear, the primary 
requirement is for a partial ordering. 

If we were to do nothing but selec-
tion, GAs would not be very interest-
ing because the trajectory of popula-
tions could contain nothing but
changing proportions of strings con-
tained in the original population. In
fact, if run repeatedly, selection alone
is a fairly expensive way of filling a
population, with high probability,
with the best structure of the initial
population.
To do something more sensible,
the algorithm needs to explore differ-
ent structures. A primary exploration
operator used in many GAs is cross-
over, and simple, one-point crossover
proceeds in three steps. First, two
individuals are chosen from the pop-
ulation using the selection operator,
and these two structures are consid-
ered to be mated. Second, a cross site
along the string length is chosen uniformly
at random. Third, position values are
exchanged between the two strings
following the cross site. For example,
starting with the two strings A =
11111 and B = 00000, if the random
choice of a cross site turns up a 3, we
would obtain the two new strings
A' = 11100 and B' = 00011 follow-
ing crossover; these strings would be
placed in the new population. This
process continues, pair by pair, until
the new population is complete, filled
with "off-strings" that are con-
structed from the bits and pieces of
good (selected) parents. There are
many other variants of crossover, and
many claims are made by their adher-
ents. However, the main issue is
whether the operator promotes the
successful exchange of necessary sub-
structures [10].
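
The one-point crossover just described can be sketched as below. The function name and its optional site argument are illustrative assumptions; the example call reproduces the A = 11111, B = 00000 case with a cross site of 3.

```python
import random


def one_point_crossover(parent_a, parent_b, site=None, rng=random):
    """Exchange the position values that follow the cross site."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if site is None:
        # Choose the cross site uniformly among the interior positions.
        site = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:site] + parent_b[site:]
    child_b = parent_b[:site] + parent_a[site:]
    return child_a, child_b


children = one_point_crossover("11111", "00000", site=3)
```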

Selection and crossover are sur-
prisingly simple operators, involving
nothing more complex than random
number generation, string copying,
and partial string exchanges. Yet
their combined action is responsible
for much of a genetic algorithm's
search punch. To understand this in-
tuitively, we need only to think in
terms of our own human processes of
innovation. What is it we are doing
when we are being innovative or cre-
ative? Often we are combining no-
tions that worked well in one context
with notions that worked well in an-
other context to form new, possibly
better ideas of how to attack the prob-
lem at hand [6]. Similarly, GAs juxta-
pose many different, highly fit sub-
strings (notions) through the
combined action of selection and
crossover to form new strings (ideas).

If selection and crossover provide 
much of the innovative capability of a 
GA, what is the role of the mutation 
operator? In a binary-coded GA, 
mutation is the occasional (low- 
probability) alteration of a bit posi-
tion, and with other codes a variety of
diversity-generating operators may
be used. By itself, mutation induces a 
simple random walk through string 
space. When used with selection 
alone, the two combine to form a par- 
allel, noise-tolerant hill-climbing al- 
gorithm. When used together with 
selection and crossover, mutation acts 
as both an insurance policy against 
losing needed diversity and as a hill 
climber. 
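
For a binary-coded GA, the mutation operator might be sketched as an independent, low-probability flip of each bit. The default rate of 0.01 is an assumed value for illustration; the article does not specify one.

```python
import random


def mutate(bits, rate=0.01, rng=random):
    """Flip each bit of a binary string independently with probability rate."""
    flipped = []
    for bit in bits:
        if rng.random() < rate:
            flipped.append("1" if bit == "0" else "0")
        else:
            flipped.append(bit)
    return "".join(flipped)


unchanged = mutate("10101", rate=0.0)
inverted = mutate("10101", rate=1.0)
```

Applied repeatedly without selection, this operator performs the random walk through string space mentioned above.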

Heated debates among evolu-
tionaries and genetic algorithmists
consider the relative importance of 
this operator or that, but most of this 
discussion is misplaced because GAs— 
and their natural counterparts—are 
multifaceted. Simple statements like 
“GAs are hill climbers,” “mutation is 
the most important operator” or 
“crossover is the most important op- 
erator” are likely to be wrong, be- 
cause GAs are complex systems, be-
having differently in different
portions of their phase space. The 
simplest GAs are discrete, nonlinear, 
stochastic, highly dimensional algo- 
rithms operating on problems of infi- 
nite variety. Not only does this get us 
into semantic difficulty, but because 
of this complexity, GAs are hard to 
design and analyze. However, just as 
the Wright brothers were able to de- 
sign the complex system we now rec- 
ognize as the airplane through an in- 
tuitive decomposition and the 
ruthless separation of subproblems 
[1], we, too, are able to design power- 
ful GAs using a similar methodology 
of invention [5]. 

This design methodology relies 
heavily on Holland's notion of sche-
mata and building blocks [11, 12]. Sim-
ply stated, schemata are similarity 
subsets (sets of strings that have one 
or more features in common), and 
building blocks are those schemata 
that are 1) consistently emphasized by 
selection and 2) respected and ex- 
changed by the genetic operators. 
Since Holland’s pioneering theories, 
much progress has been made in 
both experimentally verifying this 
building-block hypothesis and in follow- 
ing its design consequences. In par- 
ticular, we appear on the verge of an 
integrated theory of simple GA oper- 
ation [8, 10]. Moreover, one type of 
GA designed with strict adherence to 
building-block principles appears to 
give subquadratic results (in a prob-
ably approximately correct sense) to
large (l > 100 bits) problems with bil-
lions of local optima [9]. More work is 
necessary to consolidate these find- 
ings and to spread them to the variety 
of codings in use, but the availability 
of well-grounded algorithms will 
prove important to practitioners as 
the stakes are raised and larger, more 
difficult applications are attempted. 
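
To make the idea of a similarity subset concrete, a schema over binary strings can be written as a template in which some positions are fixed and others are left open. The sketch below assumes the conventional "*" wildcard notation, which is not spelled out in the text above; the function names are illustrative.

```python
def matches(schema, string):
    """Return True if the string belongs to the similarity subset named by schema."""
    return len(schema) == len(string) and all(
        s == "*" or s == c for s, c in zip(schema, string)
    )


def schema_order(schema):
    """Number of fixed (non-wildcard) positions in the schema."""
    return sum(1 for s in schema if s != "*")


population = ["11100", "10101", "00011", "11011"]
members = [s for s in population if matches("1***1", s)]
```

Here the schema 1***1 picks out the strings that begin and end with a 1, so two of the four strings in the example population are members.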


Why Use GAs in Applications? 
GAs can be attractive in applications 
work for a number of reasons: 


1. GAs can solve hard problems 
quickly and reliably. 

2. GAs are easy to interface to exist- 
ing simulations and models. 

3. GAs are extensible. 

4. GAs are easy to hybridize. 


One of the primary reasons to use 
GAs is that they are broadly compe-
tent algorithms. Empirical work has 
long suggested this, but theory is 
catching up, and it appears that GAs 
can solve problems that have many 
difficult-to-find optima. Moreover, 
because GAs work via sampling, pop- 
ulations may be sized to detect a 
given degree of function difference 
with no more than a specified 
amount of error [8]. This can make 
GAs remarkably noise tolerant. 

Because GAs use very little 
problem-specific information, they 
are remarkably easy to connect to ex- 
tant application code. Many algo- 
rithms require a high degree of inter- 
connection between the solver and 
the objective function. For example, 
dynamic programming requires a 
stage-wise decomposition of the prob- 
lem that not only limits its applicabil- 
ity, but can require massive rear- 
rangement of system models and 
objective functions. GAs, on the other 
hand, have a clean interface, requir- 
ing no more than the ability to pro- 
pose a solution and receive its evalua- 
tion. Oftentimes, getting a good 
model is nine-tenths of the battle, and 
once that model is tested and cali- 
brated, the GA can be interfaced 
quite directly without additional diffi-
culty. Moreover, because of a GA's
noise tolerance, discrete-event simu- 
lations and other noisy evaluators can 
be used directly as long as population 
sizing is performed to account for the 
stochastic variations in the evaluation
process [8]. 
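
The clean interface described above can be illustrated with a simple GA that touches the problem only through a user-supplied evaluation function. This is a minimal sketch under assumed settings (binary tournament selection with replacement, one-point crossover on every pair, and arbitrary population size, generation count, and mutation rate); it is not the code of any package named in this article.

```python
import random


def simple_ga(evaluate, length, pop_size=20, generations=30,
              mutation_rate=0.01, seed=0):
    """Search binary strings of the given length (at least 2 bits).

    The problem is seen only through evaluate(bits) -> number.
    """
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in range(length))
                  for _ in range(pop_size)]
    best = max(population, key=evaluate)
    for _ in range(generations):
        scored = [(evaluate(s), s) for s in population]

        def select():
            # Binary tournament: draw two at random, keep the better one.
            (fa, a), (fb, b) = rng.choice(scored), rng.choice(scored)
            return a if fa >= fb else b

        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = select(), select()
            site = rng.randint(1, length - 1)  # one-point crossover site
            for child in (p1[:site] + p2[site:], p2[:site] + p1[site:]):
                child = "".join(
                    ("1" if c == "0" else "0") if rng.random() < mutation_rate else c
                    for c in child
                )
                offspring.append(child)
        population = offspring[:pop_size]
        champion = max(population, key=evaluate)
        if evaluate(champion) > evaluate(best):
            best = champion
    return best


# Hypothetical model to optimize: count the 1 bits in the string.
result = simple_ga(lambda bits: bits.count("1"), length=16)
```

Swapping in a different model means changing only the function passed as evaluate; a noisy simulation could be passed the same way, although the population sizing discussed above is not handled by this sketch.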

Even simple GAs can be broadly 
capable, but real problems can pose 
unanticipated difficulties. When 
these arise, oftentimes there is a solu- 
tion from nature available to solve the 
problem. For example, in many AI
problems the search spaces are highly 
multimodal, and the solution set may 
have multiple global solutions. In 
these cases, it is desirable to have the 
population converge to multiple op-
tima simultaneously. In nature, of
course, there is no superspecies that 
uses all resources everywhere on the 
planet, but instead there are multiple 
species that occupy multiple niches, 
separated from one another by the 
obstacles imposed by geography or 
through the utilization of different 
sets of resources. In a GA the notion 
of niche and species has been stably 
imposed on the population through 
various modifications to the selection 
scheme [7], and this kind of extensi- 
bility via nature is useful in GA design. 
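
One well-known way of imposing niches through the selection scheme is fitness sharing, in which an individual's fitness is divided by a count of how crowded its neighborhood is. The text above does not name a specific modification, so the sketch below is offered only as an assumed example; the triangular sharing function, the Hamming distance measure, and the niche radius sigma are illustrative choices.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def shared_fitness(population, fitness, sigma=2):
    """Derate each individual's fitness by the crowding within radius sigma."""
    shared = []
    for individual in population:
        # Each neighbor closer than sigma adds a share between 0 and 1;
        # the individual itself always contributes 1.
        niche_count = sum(
            max(0.0, 1.0 - hamming(individual, other) / sigma)
            for other in population
        )
        shared.append(fitness(individual) / niche_count)
    return shared


def two_peaks(bits):
    """Hypothetical objective with optima at all-ones and all-zeros."""
    return max(bits.count("1"), bits.count("0"))


population = ["1111", "1111", "1111", "0000"]
derated = shared_fitness(population, two_peaks)
```

In this example the three copies of 1111 split their payoff three ways, while the lone 0000 keeps its full value, which gives selection a reason to maintain both niches.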

When nature does not call, it is 
often possible to use problem-specific 
information to help make a hybrid or 
knowledge-augmented GA. For ex- 
ample, many search domains have 
more competent local search heuris- 
tics than selection plus mutation, and 
getting the best answer in the shortest 
time often recommends combining 
the global perspective of the GA with 
the efficient local search of some 
problem-specific technique. There 
are also a number of ways that 
problem-specific information can be 
built into the operators or the cod- 
ings, and a number of these are dis- 
cussed in standard references [4, 7]. 


A Parade of Applications 

For some time GAs were mainly an 
academic's plaything, but lately there 
have been an increasing number of 
industrial-strength applications gain- 
ing national and international atten- 
tion. Here, we survey a sample of 
real-world GAs across a spectrum of 
problems from computer-aided engi- 
neering to finance, from criminal jus- 
tice to fiber-optic network design. Al- 
though the codings, operators, and 
problem structure of the different 
applications are different, recurring 
themes will emerge, and we will visit 
these at the end of our march. 


Products, Services, and Sources of 
Information 





Genetic and evolutionary algorithms grew out of academic and re-
search institutions, and today research activity is carried out at many
locations, including the following: University of Alabama, University
of Alberta (Canada), University of California at San Diego, Colorado State Uni-
versity, Dortmund University (Germany), George Mason University, University of
Illinois at Urbana-Champaign, Kyoto University (Japan), University of Michigan,
U.S. Naval Research Laboratory AI Center (Washington, D.C.), University of New
Mexico, The Rowland Institute for Science, University of Tennessee-Knoxville,
Tsukuba University (Japan), and Stanford University.

Commercially, a number of software packages are based on GA/EA 
technology: 


• Evolver, in its second release from Axcelis in Seattle, Wash., interfaces di-
rectly with Microsoft Excel to permit users to create an application model
in the spreadsheet. In this mode, Evolver adjusts spreadsheet cells geneti-
cally, and objective function values are passed from Excel back to Evolver.
In version 2.0, the GA is a .dll engine that can be accessed directly from
other Microsoft Windows applications.

• MicroGA is available from Emergent Behavior in Palo Alto, Calif., as a li-
brary of C++ objects that implement a simple GA. Applications code must
be written in C++ and interfaced with this library.

• NeuralWorks Professional II/Plus, from NeuralWare Inc. of Pittsburgh,
Penn., has recently been outfitted with a Genetic Reinforcement Learning
system. The system augments standard network training procedures by
using a simple GA to avoid getting stuck at local optima.


Additionally, a number of private consultants specialize in GA applications.
Tica Associates in Cambridge, Mass., was started in 1990 by Lawrence Davis
as a consultancy specializing in the application of GAs. Bolt, Beranek and
Newman, Inc. (BBN), devotes considerable effort to governmental and in-
dustrial applications of GAs, particularly in the areas of scheduling and mili-
tary applications.

For those who would prefer to do their own hacking, a number of
public-domain codes are available. Three books [4, 7, 14] contain sample
codes that are readily available. Additionally, NASA's software distribution
service COSMIC distributes a windows-oriented code called Splicer, which
was developed through the Johnson Space Center by Mitre Corporation,
and a GA written in C is available from the University of California at San
Diego by contacting the Internet address nici@cs.ucsd.edu.

Started in 1985, the International Conference on Genetic Algorithms (ICGA)
is the longest-running GA/EA conference, and it is held in odd-numbered
years. A European conference, started in 1990, called Parallel Problem Solv-
ing from Nature is held in even-numbered years. Specialty conferences and
workshops are popping up all over, with the Workshop on the Foundations
of Genetic Algorithms (FOGA, also in even-numbered years) being one of
the most widely attended.

A number of journals devote considerable page space to GAs and EAs.
Adaptive Behavior and Evolutionary Computation (both MIT Press) are two
startup journals that consider GA/EA-related topics. Complex Systems con-
tains articles, largely of foundational concern, and The Annals of Mathemat-
ics and Artificial Intelligence has published one special edition devoted to
GAs, and another is in the works. Two newsletters have followed GAs fairly
closely: Advanced Technology for Developers, edited by Jane Klimasauskas at
NeuralWare, Pittsburgh, Penn., and Release 1.0, edited by Esther Dyson,
New York. Electronically, the GA list is the oldest and most widely read GA-
related news list with over 1,800 recipients (subscribe at
ga-list-request@suno.aic.nrl.navy.mil). Given the fast pace of GA developments,
newsletters and electronic lists are almost essential to keep up with the
latest.
Everyone Loves (a) CAD

One of the first major commercial
applications came together in Gen-
eral Electric’s computer-aided design
(CAD) system EnGENEous [16]. This
system was designed to be a domain-
independent tool, combining the
speedy local search of a number of
traditional (and local) numerical opti-
mization tools, the convenience of
expert systems for specifying design
constraints and control information,
and the more global perspective of a
genetic algorithm. The hybrid (or
“interdigitized” in GE lingo) system
can be interfaced to coordinate the
activities of one or more domain-
specific simulation or modeling
codes, things as diverse as finite-
element models, computational fluid
dynamics codes, and discrete-event
simulators.

EnGENEous got its start in engine 
design, evolving from an expert sys- 
tem code called Engineous. An early 
test on a 100-variable portion of a 
larger high-bypass gas turbine design 
problem compared human perfor- 
mance, the expert system code alone, 
the GA alone, the GA initialized by 
the expert system, and the interdigi-
tized GA and expert system. All com-
puter systems performed better than 
a human designer, with EnGENEous 
in interdigitized mode obtaining a 
0.92% increase in turbine efficiency 
over the human designer, who was 
only able to get 0.5% better than the 
starting design. To the uninitiated 
these numbers might seem small, but 
improvements in a mature field like 
gas turbine design are hard fought, 
and even modest gains in efficiency 
translate into real customer savings
and a significant competitive advan- 
tage for GE. An interesting aspect of 
the study was the effect of hybridiza- 
tion on computation time. The GA 
alone obtained a good solution, but 
required 30,000 function evaluations. 
When the GA was partially initialized 
using several expert system-derived 
designs, the time to good solutions
dropped almost by a factor of five to
6,600 function evaluations. When the
GA and the expert system were run
iteratively, the computational effort
dropped to approximately 3,600 
function evaluations. In all cases, the 
combined systems performed better 
than either pure system acting alone. 


Since those early tests, En- 
GENEous has been updated to bring 
numerical optimization into the sta- 
ble of tools used in the design pro- 
cess, and the system has been used 
successfully in a number of applica-
tion areas. The gas turbine applica-
tion has gone on to make the new 
Boeing 777 jet engine more efficient, 
and applications in electric-utility 
planning, hydroelectric generator 
design, and steam turbine design 
have been paying off handsomely. 
The latter application has been par- 
ticularly important to GE, because 
steam turbines are designed and built 
on a custom basis with manufacturing 
times as short as 12 months. This con- 
strains the design schedule to times as 
short as 2-3 weeks, and EnGENEous 
makes it possible to design more effi- 
cient nozzles, buckets, and other 
components through the integration 
of flow and resistance computer 
codes into the optimization process. 
In the past, such short lead times 
would only permit the examination 
of a few alternatives before a design 
was sent off to be manufactured. 


Never Forget a Face 

An application that is only a stone’s 
throw—or a police station—away 
from the real world is the Faceprints 
system developed in the psychology 
department of New Mexico State 
University (NMSU) [2]. It was once
common for police artists to draw a
suspect's face from a witness's de-
scription, and more recently trans-
parency-based sets of facial features 
have been used for the same purpose. 
These systems and straightforward 
computer implementations of them 
depend on the witness's ability to jux- 
tapose individual eyes, mouths, and 
other facial features, but such a 
search process depends on the mind’s 
ability to recall facial features individ- 
ually, sometimes out of context. It is 
much easier for the mind to recognize 
similarity in a whole face that is close. 

The NMSU system taps into the
mind's eye by having a GA generate
20 faces on a computer screen. The
witness rates each face on a 10-point
subjective scale, and the GA takes that
information and, through the normal se-
lection, crossover, and mutation op-
erators, generates additional faces. The
faces are generated from an underly-
ing binary chromosome that maps
subcodes for each of five facial fea-
tures (mouth, hair, eyes, nose, chin)
into their pictorial representation,
and the picture is assembled and dis-
played.
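
The mapping from a binary chromosome to facial-feature subcodes might look like the following sketch. The article does not give the width of each subcode or the order of the features on the chromosome, so the three bits per feature and the ordering used here are assumptions for illustration only.

```python
FEATURES = ("mouth", "hair", "eyes", "nose", "chin")
BITS_PER_FEATURE = 3  # assumed width; not stated in the article


def decode_face(chromosome):
    """Split a binary chromosome into one integer subcode per facial feature."""
    expected = len(FEATURES) * BITS_PER_FEATURE
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} bits, got {len(chromosome)}")
    face = {}
    for i, feature in enumerate(FEATURES):
        subcode = chromosome[i * BITS_PER_FEATURE:(i + 1) * BITS_PER_FEATURE]
        # Each subcode indexes one stored picture of that feature.
        face[feature] = int(subcode, 2)
    return face


face = decode_face("101000111010001")
```

The witness's 10-point rating of the assembled picture then serves as the fitness of the chromosome that produced it.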

In testing the system, the NMSU
team exposed subjects to a simulated 
crime and then asked them to use the 
system to reconstruct the criminal's 
face at varying times following the 
simulated crime. Figure 1 shows one 
result of one trial where a witness re- 
constructed the face three days fol- 
lowing the simulated crime. The suc- 
cess of the technique has led NMSU 
to apply for a patent, and re-
finements to the technique have
almost reached the point of
commercialization.


Big Bucks from Yen (or Pounds or Marks)

Various AI systems have been used in 
financial applications, and GAs are no 
exception. A startup in Santa Fe, 
N.M., called the Prediction Com- 
pany, has developed a set of time- 
series prediction and trading tools for 
currency trading in which GAs play 
an important role. 

In particular, rule-like structures
are evolved that have left-hand sides
that are matched when time-series
data enter specified regions [15], and
the right-hand sides predict whether
the time-series will go up or down. A
collection of these rule structures
forms a population, and these are
trained against real financial data,
using an objective function that tries
to reduce the mean-squared predic-
tion error as it tries to increase the
confidence of prediction.

In financial circles, one measure of 
investing efficacy is the Sharpe ratio, 
roughly the ratio of return to risk. In 
one currency-trading application, a 
known group of 20 currency traders 
was found to have Sharpe ratios in 
the range 0.3-1.0 with most traders 
in the range 0.3-0.7. Tests with the 
Prediction Company's technique 
demonstrated ratios as good as the 
best of the known currency traders. 
Recently, O'Conner Associates, a Chi- 
cago-based affiliate of Swiss Bank 
Corporation, entered into a long-
term agreement to provide financing 
for Prediction Company's operations 
and trading. 
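
As a rough illustration of the Sharpe ratio mentioned above, the sketch below divides the mean excess return by the sample standard deviation of the excess returns. It is a per-period figure with no annualization, and the return series and zero risk-free rate are made-up values, not Prediction Company data.

```python
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free_rate for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


# Hypothetical per-period returns of a trading rule.
ratio = sharpe_ratio([0.01, 0.03, 0.02])
```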
Poetry to Our Ears

Interoffice fiber-optic networks are
already a big business at US WEST, 
but an evolutionary algorithm devel- 
oped in the Operations Research 
Modeling group [3] promises to 
make network additions faster and 
cheaper. 


SONETs (synchronous optical net-
works) involve multiple rings of inter-
connected fiber-optic cable, where no 
more than 48 nodes are permitted 
per ring. Small networks at US 
WEST have been designed using tra- 
ditional operations research tech- 
niques, but the design of the expan- 
sion of larger networks has been 
impractical by such methods. Design 
of such large networks was done by 
hand and relied on the designer's in- 
tuition and experience. As an inter- 
mediate step, these efforts were aug- 
mented by the use of commercially 
available network simulation tools. 
More recently, the evolutionary algo- 
rithm developed at US WEST has 
paid dividends by allowing large
networks to be designed efficiently
and quickly. The specific technique
uses a relatively small population
with mutation-like operators that ei-
ther expand an existing ring or start a 
new ring. The new alternative net- 
work is evaluated by running multi- 
period simulations on the commer- 
cial code. Networks that meet 
performance constraints are then 
compared on the basis of cost, with 
better networks surviving and worse 
networks dying off. 

The tool was first tried in May 
1992, and network design time has 
been cut from two person-months to 
roughly two person-days. Cost sav- 
ings are estimated in the range from 
$1 million to $10 million per design, 
and with 20 designs required over 
the next six to eight years, total sav- 
ings could top the $100 million mark. 


Other Applications 

Space prohibits the fuller exploration 
of many of the interesting applica- 
tions that are making or will soon 
make their mark. Here, we briefly 
survey a potpourri of applications in 
a number of different areas. 

Keeping simulated top guns on time.
Bolt, Beranek and Newman Inc. 
(BBN), of Cambridge, Mass., has cre-
ated a GA-based scheduling algo-
rithm for the two System Integration
Test Station Laboratories at the U.S.
Navy's Point Mugu Naval Airbase [4].
These laboratories provide simulated 
environments for F-14 equipment 
and software testing, and the volume 
of testing demands every hour be 
used to the fullest. The BBN sched- 
uler has been in use for a year and a 
half, and results have been good. The 
automatic system replaces a retir- 
ing human scheduler, and it captures 
the hard and soft constraints that 
were used in the manual scheduling 
process. 

Tanks a lot. Hughes Missile Systems 
Company, Canoga Park, Calif., is
applying the idea of genetic pro- 
gramming [14] to infrared image 


Figure 1. The Faceprints system
evolves a face through interac-
tion with the witness to a crime.
The images shown here are from
laboratory tests with human sub-
jects. On the left is the actual sim-
ulated criminal, and on the right
is the Faceprints-generated
image constructed three days
after the simulated crime.


Figure 2. Eight tanks are shown 
in light clutter in a typical infra- 
red image. The Hughes genetic- 
programming system is able to 
generate good tank detectors by
recombining standard arithme- 
tic operators and sets of primi- 
tive features. 
target discrimination. In genetic
programming, restricted Lisp expres- 
sions form the chromosome, and 
tree-based crossover recombines dif- 
ferent structures while selection 
chooses the better ones. In the 
Hughes system [18], code composed 
of arithmetic operators and a logical-
if operator is applied to inputs of 20
terminals, consisting of statistical fea- 
tures of a potential target. If the ex- 
pression evaluates to a real value 
greater than zero, the image segment 
is taken to be a target. In tests against 
difficult data similar to that of Figure 
2, genetic programming beat a neu- 
ral network trained with backpropa- 
gation learning and a binary-tree 
classifier. The project has been so suc- 
cessful it is being implemented in 
hardware. 
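
As a rough illustration of the decision rule just described, the sketch below evaluates a small expression tree over a 20-element feature vector and declares a target when the result is greater than zero. The tuple encoding, the semantics of the logical-if (condition true when positive), and the sample detector are assumptions made for illustration; they are not taken from the Hughes system.

```python
import operator

# An expression is a terminal index (int), a constant (float),
# or a tuple (op, arg, ...). This encoding is hypothetical.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(expr, features):
    if isinstance(expr, int):      # terminal: index into the feature vector
        return features[expr]
    if isinstance(expr, float):    # numeric constant
        return expr
    op, *args = expr
    if op == "if":                 # logical-if: pick a branch on the sign of the condition
        cond, then_branch, else_branch = args
        chosen = then_branch if evaluate(cond, features) > 0 else else_branch
        return evaluate(chosen, features)
    left, right = args
    return OPS[op](evaluate(left, features), evaluate(right, features))

def is_target(expr, features):
    """The image segment is taken to be a target if the value exceeds zero."""
    return evaluate(expr, features) > 0

# A made-up detector over terminals 0..3 of the 20 statistical features.
detector = ("if", ("-", 0, 1), ("*", 2, 0.5), ("-", 3, 2))

bright = [0.0] * 20
bright[0], bright[1], bright[2] = 2.0, 1.0, 4.0
clutter = [0.0] * 20
clutter[1], clutter[2], clutter[3] = 1.0, 4.0, 1.0

print(is_target(detector, bright), is_target(detector, clutter))
```

In genetic programming proper, tree-based crossover would swap subtrees between such expressions and selection would keep the better classifiers; only the evaluation step is shown here.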

Fuzzy genetics. The Tuscaloosa of- 
fice of the U.S. Bureau of Mines is 
moving a GA-adapted, fuzzy logic 
controller (GA-FLC) from the labora- 
tory to the shop floor in a mineral 
recovery application. Off-line experi-
ments [13] have shown that GA-FLCs
are able to control pH and other rele- 
vant variables better than standard 
control algorithms to the point where 
industrial trials are warranted. Initial 
installation will begin shortly at Cliffs 
Mining Company in Ishpeming, 
Mich., and Kennecott Mining in Salt
Lake City, Utah, with improvements 
in recovery efficiency and consistency 
expected shortly thereafter. 

Let's get geophysical. Solving geo-
physical inverse problems is essential
in oil exploration, and Advance Geo-
physical, Englewood, Colo., is using a
hybrid GA and gradient descent to 
solve what is called the static correction 
problem [17]. In seismic surveys, one 
of the first and most important cor- 
rections that is made to observed data 
accounts for the effect of reflection 
travel times off irregularities in the 
surface topography and the thickness 
and travel times in the low-velocity or 
weathered layer. In this application, 
the hybrid GA-gradient searcher is 
made available as part of a larger 
seismic-survey software package. 

This quick applications rundown 
reveals a surprising breadth of appli- 
cation area as well as the use of differ- 
ent codings, operators, and objective 
functions. On the other hand, the 
applications are surprisingly similar
in their underlying motivation and
approach. Many of the applications 
demonstrated a fairly rigid separa- 
tion between model and searcher, 
and this is likely to be the case in 
many other applications as well. Also, 
many of the applications had useful 
heuristics and local search techniques
and found it was useful to bring those
on board to improve convergence
times. Perhaps most importantly, 
each of the applications came to GAs 
for performance, not for fun. Al- 
though academic studies can afford 
the luxury of using technology for its 
own sake, practitioners cannot, but 
even with a nuts-and-bolts attitude 
like this, more and more applications 
are turning to GAs to solve hard 
problems that have long awaited 
computer solution.


Crystal Ball Not Needed 
With an increasing number of practi- 
cal applications in existence, the fu- 
ture of GAs seems fairly bright. Al- 
though this article has concentrated 
largely on activities in North America, 
interest and activity is strong in Eu- 
rope (particularly the UK and Ger- 
many) and Japan. On a recent trip to 
Japan, I witnessed research and de- 
velopment activities in a number of 
major corporations, including NEC, 
Mitsubishi Electric, and Fujitsu, and I 
heard widespread rumors of patent 
and preemptive trademarking activ- 
ity for GA-based products. 

All applications efforts are aided by 
a growing body of practical theory 
that helps us understand what GAs 
process, how they can process it bet- 
ter, and how long and how close we 
should expect to come to global or 
near-global solutions in difficult 
problems. Tests on large-scale prob- 
lems with billions of local optima are 
showing us that GAs can solve prob- 
lems much harder than was once 
thought, and the convergence can be 
obtained more quickly and more reli- 
ably than was previously possible. 

With practical successes growing in 
number and with theoretical results 
paving the way for practical, yet well- 
grounded applications, nature’s fa- 
vorite search algorithm may soon 
become industry's as well. It is be- 
coming clear that GAs and EAs are 
changing our vision of what it is pos-
sible to design and operate. As knowl-
edge of this new intellectual leverage
becomes more widespread, no crystal
balls are needed to suggest that the
fuller realization of robust computer
aids leaves us standing on the thresh-
old of what might soon be called a
golden age of adaptation.
Financial Projection
SLC


• What Crystal Ball Does
• How Crystal Ball Uses Monte Carlo Simulation
• "Futura Apartments" Spreadsheet Tutorial
• "Vision Research" Spreadsheet Tutorial
• Defining Assumptions
• Selecting and Defining Distributions
• Running a Simulation
• Interpreting the Results


In this chapter are two tutorials, one short,
one long, providing an overview of Crystal
Ball's features. The first tutorial, the "Futura
Apartments" spreadsheet, simulates profit/
loss projections from apartment rentals. This
tutorial is ready to run so you can quickly see
how Crystal Ball works. If you work regularly
with statistics and forecasting techniques,
this may be all the introduction you need
before running your own spreadsheets with
Crystal Ball.

The second tutorial, the "Vision Research"
spreadsheet, gives you a chance to enter data
and set up a complete simulation for a major
corporate expenditure decision. As you work
through the second tutorial, do not worry
about making mistakes; recovery is as easy as
backing up and repeating the steps. If you
need additional help, refer to Appendix A,
Error Recovery, or the Crystal Ball Help menus.

Now, spend a few moments learning how
Crystal Ball can help you make better
decisions under conditions of uncertainty.


In this Chapter





What Crystal Ball Does 


Glossary Term:
Spreadsheet Model -
Any spreadsheet that
represents an actual or
hypothetical system or
set of relationships.

Glossary Term:
Assumption -
An estimated value or input
to a spreadsheet model.

Glossary Term:
Monte Carlo Simulation -
A system which uses random
numbers to measure the
effects of uncertainty in a
spreadsheet model.


Crystal Ball extends the forecasting capability of your spreadsheet
model and provides the information you need to become a more
accurate, efficient and confident decision-maker. As a spreadsheet
user, you know that spreadsheets have two major limitations:


• You can only change one spreadsheet cell at a time. As a result,
exploring the entire range of possible outcomes is next to impos-
sible; you cannot realistically determine the amount of risk that is
impacting your bottom line.

• "What-if" analysis always results in single-point estimates which
do not indicate the likelihood of achieving any particular out-
come. While single-point estimates may tell you what is possible,
they will not tell you what is probable.


Crystal Ball overcomes both of these limitations and takes the guess- 
work out of spreadsheet analysis by providing fast and accurate results 
in minutes: 


• You can describe a range of possible values for each uncertain
cell in your spreadsheet. Everything you know about each
assumption and how it affects your result is expressed all at once.

• Using a process called Monte Carlo Simulation, Crystal Ball
displays your results in a forecast chart that shows the entire
range of possible outcomes and the likelihood of achieving each
of them. In effect, Crystal Ball moves you beyond "what-if"
scenarios by providing an accurate statistical picture of the range
of possibilities associated with your assumptions.


The best way to quickly understand this process is to start Crystal Ball
and work on the first tutorial: the "Futura Apartments" spreadsheet.

1. Start Crystal Ball.

Crystal Ball Note: Directions for installing and starting Crystal Ball are in
the “How to Install Crystal Ball” section at the front of this manual.

2. Open the "Futura Apartments" spreadsheet file, FUTURA.XLS
(Excel) or FUTURA.WK4 (Lotus 1-2-3), from the Crystal Ball
Examples directory.

The "Futura Apartments" spreadsheet is displayed.


[Screenshot: the "Futura Apartments" spreadsheet, showing Number of Rental Units 35, Rent per Unit $500.00, Monthly Expenses $15,000.00, and Profit or Loss $2,500.00]

In this example, you are a potential purchaser of the Futura Apart-
ments. You have researched the situation and created the
above spreadsheet to help you make a knowledgeable decision. Your
work has led you to make the following assumptions:

• $500 per month is the going rent for the area.

• The number of units rented during any given month will be
somewhere between 30 and 40.

• Operating costs will average around $15,000 per month for the
entire complex, but may vary slightly from month to month.

Given these assumptions, you want to know how profitable the
complex will be for various combinations of rented units and operat-
ing costs. As useful as spreadsheets are, this would be difficult to
determine using a spreadsheet alone. The last two assumptions cannot
be reduced to single values as required by the spreadsheet format. You
would need to spend a great deal of time working through "what-if"
scenarios, entering single values and recording the results, to cover
all the combinations. Even then, you would likely be left with a
mountain of data instead of the overall profit and loss picture.


With Crystal Ball, this kind of analysis is easy.


1. Choose Run from the Run menu on the menu bar. 


Crystal Ball runs a simulation for the situation in the "Futura
Apartments" spreadsheet and displays a forecast chart as it is
being created.


16 Crystal Ball User Manual 





Glossary Term:
Iteration or Trial -
A three-step process in
which Crystal Ball
generates random
numbers for assumption
cells, recalculates the
spreadsheet model(s), and
displays the results in a
Forecast Chart.

Glossary Term:
Probability -
(Classical Theory) The
likelihood of an event
occurring.


After the simulation has run for at least 200 iterations, as displayed in
your spreadsheet's status bar:

2. Choose Stop from the Run menu on the forecast chart (Excel) or
on the menu bar above the forecast chart (Lotus 1-2-3).

Excel Note: If the forecast chart disappears behind Excel's window when a
simulation is running, you can bring it back to the front by pressing Alt-
Tab.


[Forecast chart: "Forecast: Profit or Loss" frequency chart, 200 trials shown; horizontal axis "Monthly rental income" from ($2,000.00) to $7,000.00]





The forecast chart reveals the total range of profit and loss outcomes
predicted for the "Futura Apartments" scenario. Each bar on the chart
represents the likelihood, or probability, of earning a given income.
The cluster of columns near the center indicates that the most likely
income level is between $250 and $4,750 per month. Crystal Ball is
also forecasting that the worst case is a $2,000 loss and the best case is
nearly a $7,000 gain.


Now, you can use Crystal Ball to determine the statistical likelihood 
of making a profit. 


1. Press the Tab key twice or select the left edit box on the forecast
window.


2. Type 0 and press the Enter key. 
The value in the Certainty box changes to reflect the probability of an
income level ranging from $0 to positive infinity, that is, the probability of
making a profit. With this information, you are in a much better
position to make a decision on whether to purchase the Futura
Apartments. In this case, there is a 90.55% chance that you will make a
profit, as shown below:





[Forecast chart: "Forecast: Profit or Loss" frequency chart, 200 trials shown; horizontal axis "Monthly rental income" from ($2,000.00) to $7,000.00; Certainty box below the chart]
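
The Futura Apartments analysis can be approximated outside Crystal Ball with a short Python sketch. The distributions are assumptions: the tutorial says only that 30 to 40 units are rented and that costs vary slightly around $15,000, so the equally likely unit counts and the $1,000 spread on expenses below are illustrative choices, and the figures it prints will not match the manual's chart exactly.

```python
import random

def simulate_futura(trials=200, seed=1):
    """Run the given number of trials and return the profit from each one."""
    rng = random.Random(seed)
    profits = []
    for _ in range(trials):
        units = rng.randint(30, 40)             # assumed: each count 30..40 equally likely
        expenses = rng.gauss(15_000, 1_000)     # assumed: $1,000 standard deviation
        profits.append(units * 500 - expenses)  # $500 rent per unit
    return profits

profits = simulate_futura()
# "Certainty" for the range $0 to +infinity: the share of trials showing a profit.
certainty = sum(p > 0 for p in profits) / len(profits)
print(f"worst {min(profits):,.0f}  best {max(profits):,.0f}  P(profit) {certainty:.1%}")
```

Typing 0 in the left edit box of the forecast window corresponds to the `p > 0` filter here: it counts the trials whose profit falls above that lower bound.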





How Crystal Ball Uses Monte Carlo Simulation 


Glossary Term:
Random Number -
A mathematically selected
value which is generated
(by a formula or selected
from a table) to conform
to a probability distribu-
tion.

Glossary Term:
Random Number
Generator -
A method implemented
in a computer program
that is capable of
producing a series of
independent, random
numbers.


Most real-world problems involving elements of uncertainty are too
complex to be solved by strict analytical methods. There are simply
too many possible combinations of input values to calculate every
possible answer.

Monte Carlo simulation is an efficient technique for analyzing these
types of problems. It is a simple technique that requires only a
random number table or a random number generator on a com-
puter. Instead of calculating all possible combinations of input values,
Monte Carlo simulation randomly chooses a relatively small number
of values as inputs to the problem to generate a very good approxima-
tion of the answer. During a simulation, Crystal Ball uses Monte Carlo
simulation to randomly create inputs to your problem that look like
real-life possibilities, calculate the results, and plot them on a graph.
Random numbers are distributed according to range estimates you
define for the assumption cells.
[Diagram: A Trial or Iteration. Generate Random Numbers for Assumption Cells, then Calculate Entire Spreadsheet, then Display Results in a Forecast Chart]


The spreadsheet is recalculated to produce results for the forecast cells.
Crystal Ball charts the forecast results in an easy-to-understand
graphical format (forecast chart). As the numbers change in the
assumption cells, the values in the forecast cells change, and the
forecast chart displays these values.





Statistical Note: An Assumption is an input value. A Forecast is an output
value.


This is an iterative process which continues until either:

• All of the trials specified for the simulation have been completed,
or

• The simulation is stopped manually.
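
The three-step trial and the two stopping conditions above can be sketched as a small loop. This is a minimal illustration, not Crystal Ball's implementation; the function name, the dictionary of assumption samplers, and the `should_stop` callback that stands in for a manual stop are all hypothetical.

```python
import random

def run_simulation(assumptions, model, max_trials, should_stop=lambda n: False, seed=0):
    """Repeat trials until max_trials is reached or should_stop(trials_done) is true."""
    rng = random.Random(seed)
    forecasts = []
    while len(forecasts) < max_trials and not should_stop(len(forecasts)):
        # Step 1: generate random numbers for the assumption cells.
        cells = {name: draw(rng) for name, draw in assumptions.items()}
        # Step 2: recalculate the spreadsheet model to get the forecast value.
        forecasts.append(model(cells))
        # Step 3: a charting tool would redraw the forecast chart here.
    return forecasts

assumptions = {"units": lambda rng: rng.randint(30, 40)}   # an input (assumption)
model = lambda cells: cells["units"] * 500 - 15_000        # an output (forecast)

all_trials = run_simulation(assumptions, model, max_trials=500)
stopped_early = run_simulation(assumptions, model, max_trials=500,
                               should_stop=lambda n: n >= 200)
print(len(all_trials), len(stopped_early))
```

Keeping the model as a plain function of the assumption values mirrors the manual's distinction between input cells and forecast cells.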


Keep in mind that the spreadsheet model can only approximate a
real-world situation. When you build your own spreadsheet models,
you will need to carefully examine your problem and continually
refine the models until they reflect your real-world situation as closely
as possible.


Crystal Ball also provides statistics that describe the forecast results.
These are presented in detail in Chapter 2, Understanding the Terminol-
ogy, and Chapter 4, Interpreting the Results.


When you are ready to move on to the next tutorial:

1. Choose Reset from the Run menu on the forecast chart.
2. Click OK in the dialog box that is displayed.

The simulation is reset and the forecast window disappears.
3. Close the "Futura Apartments" spreadsheet.


The "Vision Research" Spreadsheet 


The remainder of this chapter contains a tutorial for the "Vision Re-
search" spreadsheet. This tutorial provides a more realistic situation to
let you examine Crystal Ball's features in greater depth. However, if
you feel comfortable running Crystal Ball now, you may wish to turn
to Chapter 2, Understanding the Terminology, for some background on
Crystal Ball terminology. Then you can read Chapter 3, Setting Up and
Running a Simulation, and start analyzing your own spreadsheets.
The "Vision Research" spreadsheet models a business situation filled
with uncertainty. Vision Research has completed preliminary develop-
ment of a new drug, code named ClearView, that is designed to
correct nearsightedness. This revolutionary new product could be
completely developed and tested in time for release next year if the
FDA approves the product. Although the drug works well for some
patients, the overall success rate is marginal, and Vision Research is
uncertain whether the FDA will approve the product.


Vision Research will use Crystal Ball to help decide whether to scrap
the project or proceed to develop and market this revolutionary new drug.
The ClearView project is a multimillion dollar risk. Crystal Ball is a
powerful decision-support program designed to take the mystery out
of decisions like this.

To see how Crystal Ball works in a typical business decision:


1. Open the VISION.XLS (Excel) or VISION.WK4 (Lotus 1-2-3)
file from the Crystal Ball Examples subdirectory.

The "Vision Research" spreadsheet for the "ClearView Project" is
displayed.


[Screenshot: the "Vision Research" spreadsheet for the ClearView Project, with a costs section (in millions) that includes the development cost of ClearView to date, a market study section (in millions) covering persons in the U.S. with nearsightedness today and after one year, and a column of suggested distributions]

Take a look at the "Vision Research" spreadsheet on your screen. This
spreadsheet models the problem that Vision Research is trying to
solve. It includes value cells and formula cells. Value cells contain
numeric values; formula cells contain formulas that refer to the
cells.
Defining Assumptions


Glossary Term:
Probability Distribution -
A set of all possible events
and their associated
probabilities.


In Crystal Ball, you define an assumption for a value cell by choosing
a probability distribution that describes the uncertainty of the data
in that cell. You select from the 16 distribution types in the Distribu-
tion Gallery.


How do you know which distribution type to choose? This portion of 
the tutorial has been designed to help you understand how to select a 
distribution type based on the answer you are looking for. In the 
following examples, you will select the assumption cells in the
"Vision Research" spreadsheet and choose probability distributions
that most accurately describe the uncertainties of the ClearView
project.


This tutorial also explains the reasons for choosing a particular
distribution for each assumption. Detailed descriptions of how to
select distributions are in Chapter 2, Understanding the Terminology,
and Chapter 3, Setting Up and Running a Simulation.


Defining Testing Costs: The Uniform Distribution 


So far, Vision Research has spent $10,000,000 developing ClearView
and expects to spend an additional $3,000,000 to $5,000,000 to test
it, based on the cost of previous tests. For this variable, "testing costs,"
Vision Research thinks that any value between $3,000,000 and
$5,000,000 has an equal chance of being the actual cost of testing.


Using Crystal Ball, Vision Research chooses the Uniform distribution
to describe the testing costs. The Uniform distribution describes a
situation where all values between the minimum and maximum
values are equally likely to occur, so this distribution best describes
the cost of testing ClearView.


Once you choose the correct distribution type, you are ready to define 
the assumption cell. 


To define the assumption cell for testing costs: 


1. Click cell C5.
2. Choose Define Assumption from the Cell menu.


Crystal Ball displays a dialog box showing the Distribution 
Gallery. 
[Screenshot: Distribution Gallery dialog box]


3. Click the Uniform Distribution. 


4. Click OK. 


Crystal Ball displays a dialog box showing the Uniform distribu-
tion you chose for C5.











[Screenshot: Cell C5: Uniform Distribution dialog box, with Assumption Name "Testing Costs" and default Min and Max parameters]









Since cell C5 already has a name next to it on the spreadsheet, the
name is displayed in the dialog box. Use this name rather than typ-
ing a new one. Also, notice that Crystal Ball assigns default values to
the distribution. The method Crystal Ball uses to assign these
default values to each distribution is explained in Appendix E, Default
Names and Distribution Parameters.


The Uniform distribution has two parameters: minimum and maxi-
mum. Vision Research expects to spend a minimum of $3,000,000
and a maximum of $5,000,000 on testing. Use these values in place of
the defaults to specify the parameters of the Uniform distribution in
Crystal Ball, as described in the following steps:


5. Type 3 in the Min box. (Remember, the numbers on the spread-
sheet represent millions of dollars.)

This represents $3,000,000, the minimum amount Vision Re-
search estimates for testing costs.

6. Press Tab and type 5 in the Max box.

This represents $5,000,000, the maximum estimate for testing
costs.

7. Click Enter.

The distribution changes to reflect the values you entered.


[Screenshot: Uniform Distribution dialog box for "Testing Costs"; callout: the distribution changes to reflect the values you entered]





If you entered the values correctly, your screen looks like the example
above. If you think you made a mistake, repeat steps 5 through 7.
Later, when you run the simulation, Crystal Ball will generate random
values for cell C5 that are evenly spread between 3 and 5 million
dollars.

8. Click OK to return to the spreadsheet.
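
To see what "evenly spread between 3 and 5 million dollars" means in practice, the sketch below draws testing costs from a Uniform distribution using Python's standard `random` module. The sample size and seed are arbitrary choices made for illustration; this is not how Crystal Ball generates its values internally.

```python
import random

rng = random.Random(42)

# Testing costs in millions of dollars: Min = 3, Max = 5, all values equally likely.
testing_costs = [rng.uniform(3.0, 5.0) for _ in range(10_000)]

mean_cost = sum(testing_costs) / len(testing_costs)
lower_half = sum(c < 4.0 for c in testing_costs) / len(testing_costs)

# With a Uniform distribution the mean sits at the midpoint, (3 + 5) / 2 = 4,
# and about half of the draws fall below it.
print(f"mean {mean_cost:.3f}  share below 4.0: {lower_half:.3f}")
```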
Defining Marketing Costs: The Triangular Distribution


Vision Research plans to spend a sizeable amount marketing
ClearView if it is approved by the FDA. They expect to hire a large
sales force and kick off an extensive advertising campaign to educate
the public about this exciting new product. Including sales commis-
sions and advertising costs, Vision Research expects to spend between
$12,000,000 and $18,000,000, most likely $16,000,000.

Vision Research chooses the Triangular distribution to describe
marketing costs because the Triangular distribution describes a
situation where you can estimate the minimum, maximum, and most
likely values to occur.

To define the assumption cell for marketing costs:

1. Click cell C6.
2. Choose Define Assumption from the Cell menu.
Crystal Ball displays the dialog box for the Distribution Gallery.

3. Click the Triangular Distribution.
4. Click OK.

Crystal Ball displays a dialog box showing the Triangular distribu-
tion you chose for C6.

[Screenshot: Cell C6: Triangular Distribution dialog box, with Assumption Name "Marketing Costs"]


Now specify the parameters for the Triangular distribution. As you
can see in the example above, the parameters for the Triangular
distribution are different from those specified earlier for the Uniform
distribution. The Triangular distribution has three parameters:
minimum, maximum, and likeliest. The following steps explain how
to enter the parameters of the Triangular distribution:


5. Type 12 in the Min box.

This represents $12,000,000, the minimum amount Vision
Research estimates for marketing costs.

6. Press Tab to access the Likeliest box. If it does not contain the
value 16, type 16.

This represents $16,000,000, the most likely amount for market-
ing costs.

7. Press Tab and type 18 in the Max box.

This represents $18,000,000, the maximum estimate for market-
ing costs.

8. Click Enter.

The distribution changes to reflect the values you entered.


[Screenshot: Triangular Distribution dialog box for "Marketing Costs" after entering Min 12.0, Likeliest 16.0, and Max 18.0]
When you run the simulation, Crystal Ball will generate random
values that center around 16, with fewer values near 12 and 18.

9. Click OK to return to the spreadsheet.
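
A comparable Triangular assumption can be sampled with Python's standard library as a rough check on the shape described above. Note that `random.triangular` takes its arguments in the order low, high, mode, which differs from the Min, Likeliest, Max order of the dialog box; the sample size and seed are arbitrary.

```python
import random

rng = random.Random(2024)

# Marketing costs in millions of dollars: Min 12, Likeliest 16, Max 18.
marketing_costs = [rng.triangular(12.0, 18.0, 16.0) for _ in range(10_000)]

mean_cost = sum(marketing_costs) / len(marketing_costs)
near_mode = sum(15.0 <= c <= 17.0 for c in marketing_costs) / len(marketing_costs)
near_min = sum(12.0 <= c <= 13.0 for c in marketing_costs) / len(marketing_costs)

# The theoretical mean of a triangular distribution is (min + likeliest + max) / 3,
# here (12 + 16 + 18) / 3, or about 15.33.
print(f"mean {mean_cost:.2f}  near 16: {near_mode:.2f}  near 12: {near_min:.2f}")
```

The share of draws within one million of the likeliest value is much larger than the share within one million of the minimum, which is the "fewer values near 12 and 18" behavior the tutorial describes.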
Defining Patients Cured: The Binomial Distribution


Before the FDA will approve ClearView, Vision Research must conduct
a controlled test on a sample of 100 patients for one year. The FDA
has stipulated that they will approve ClearView if it completely
corrects the nearsightedness of 20 or more of these patients without
any significant side-effects. In other words, 20% or more of the
patients tested must show corrected vision after taking ClearView for
one year. Vision Research is very encouraged by their preliminary
testing, which shows a success rate of around 25%.


For this variable, "patients cured," Vision Research only knows that
their preliminary testing shows a cure rate of 25%. Will ClearView
meet FDA standards? Using Crystal Ball, Vision Research chooses the
Binomial distribution to describe the uncertainties in this situation
because the Binomial distribution describes the number of successes
(25) in a fixed number of trials (100).

To define the assumption cell for patients cured:


1. Click cell C10. 
2. Choose Define Assumption from the Cell menu. 
Crystal Ball displays the Distribution Gallery dialog box. 


3. Click the Binomial Distribution. 


4. Click OK.
Crystal Ball displays the Binomial distribution (notice that the default
value for the probability parameter is 0.5 or 50%).


[Screenshot: Binomial Distribution dialog box with default parameters]
The Binomial distribution has two parameters: probability (prob) and
trials. You know that Vision Research experienced a 25% success rate
during preliminary testing, so use the value .25 for the probability
parameter to show the likelihood or probability of success.

Crystal Ball Note: All probabilities can be expressed either as decimal
fractions between 0 and 1, such as .03, or as whole numbers followed by
the percent sign, such as 3%.


You also know the FDA expects Vision Research to test 100 people, so
use the value 100 for the trials parameter. The following steps explain
how to enter these parameters in the Binomial distribution.


5. Type .25 in the Prob box.

This represents the 25% likelihood or probability of successfully
correcting nearsightedness.

6. The Trials box should contain the value 100. If it does not, press
Tab and type 100 in the Trials box.

This represents the 100 patients in the FDA test.
7. Click Enter.
The distribution changes to reflect the values you entered:


[Screenshot: Binomial Distribution dialog box for "Patients Cured"]





When you run the simulation, Crystal Ball will generate random
integer values between 0 and 100, simulating the number of patients
that would be cured in the FDA test.

8. Click OK to return to the spreadsheet.
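
For this particular assumption the question "will at least 20 of 100 patients be cured?" can also be answered exactly from the Binomial formula, without simulation. The sketch below is offered as a cross-check under the tutorial's stated parameters (probability .25, 100 trials); the helper names are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k_min, n, p):
    """Probability of k_min or more successes in n trials."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

# FDA threshold: 20 or more of the 100 patients cured, at a 25% cure rate.
approval = prob_at_least(20, 100, 0.25)
total = prob_at_least(0, 100, 0.25)   # sanity check: all outcomes sum to 1
print(f"P(at least 20 cured of 100) = {approval:.3f}")
```

A simulation would estimate the same quantity by counting the share of trials in which the generated integer is 20 or more.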
Defining Growth Rate: The Custom Distribution


Vision Research has determined that 40,000,000 people in the United
States are currently afflicted with nearsightedness, and that an
additional 0% to 5% will develop this condition during the year in
which testing will take place.

However, the marketing department has learned that there is a 25%
chance that a competing product will be released on the market soon.
This product would decrease ClearView's potential market by 5% to
15%.

This variable, "growth rate of nearsightedness," cannot be described
by any of the standard probability distributions. Since the uncertain-
ties in this situation require a unique approach, Vision Research
chooses Crystal Ball's Custom distribution to define growth rate. For
the most part, the Custom distribution is used to describe situations
that other distribution types cannot.

The method for specifying parameters in the Custom distribution is
quite unlike the other distribution types, so follow the directions
carefully. If you make a mistake, click Gallery to return to the Distri-
bution Gallery, then start again at step 3.

Use the Custom distribution to plot both the potential increase and
decrease of ClearView's market.

To define the assumption cell for the growth rate of nearsightedness:

1. Click cell C15.
2. Choose Define Assumption from the Cell menu.
Crystal Ball displays the Distribution Gallery dialog box.

3. Click the Custom Distribution.

4. Click OK.
Crystal Ball displays the Custom distribution dialog box (notice
that the chart area remains empty until you enter the values for
the distribution).

[Screenshot: Custom Distribution dialog box with an empty chart area; vertical axis "Relative Probability"; callout: the box remains empty until you enter values]

To enter the first range of values:

5. Type 0% in the Value box.

This represents a 0% increase in the potential market.
6. Press Tab and type 5% in the Value2 box.

This represents a 5% increase in the potential market.
7. Press Tab and type 75% in the Prob box.

This represents the 75% chance that Vision Research's competitor
will not enter the market and reduce Vision Research's share.

8. Click Enter.
Crystal Ball displays a Uniform distribution for the range 0.00% to
5.00%.

[Screenshot: Custom Distribution dialog box; callout: Uniform distribution for the first range of values]
To enter the second range of values:

9. Type -15% in the Value box.

This represents a 15% decrease in the potential market.
10. Press Tab and type -5% in the Value2 box.

This represents a 5% decrease in the potential market.
11. Press Tab and type 25% in the Prob box.

This represents the 25% chance that Vision Research's competitor
will enter the market place and decrease Vision Research's share
by 5% to 15%.

12. Click Enter.

Crystal Ball displays a Uniform distribution for the range -15% to
-5%. Both ranges are now displayed in the Custom distribution
dialog box.

[Screenshot: Cell C15: Custom Distribution dialog box; callout: Crystal Ball displays both ranges]





In Chapter 2, Understanding the Terminology, you will learn to use a
special feature on the Custom distribution dialog box: the Data
button. You can use the Data button to pull numbers from specified
cell ranges on the spreadsheet rather than typing them in the Custom
distribution dialog box. When you run the simulation, Crystal Ball
generates random values within the ranges you specified.


13. Click OK to return to the spreadsheet. 
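
The two ranges entered above amount to a mixture: with probability 75% the growth rate is drawn uniformly from 0% to 5%, and with probability 25% it is drawn uniformly from -15% to -5%. The following sketch samples from that mixture; it is an illustration of the idea under those stated parameters, not a description of how Crystal Ball stores or samples a Custom distribution.

```python
import random

# (low, high, probability) for each range, as entered in steps 5 to 12.
RANGES = [
    (0.00, 0.05, 0.75),     # competitor does not enter: market grows 0% to 5%
    (-0.15, -0.05, 0.25),   # competitor enters: market shrinks 5% to 15%
]

def draw_growth_rate(rng):
    """Pick a range by its probability, then draw uniformly inside it."""
    weights = [prob for _, _, prob in RANGES]
    low, high, _ = rng.choices(RANGES, weights=weights)[0]
    return rng.uniform(low, high)

rng = random.Random(7)
samples = [draw_growth_rate(rng) for _ in range(10_000)]
share_negative = sum(s < 0 for s in samples) / len(samples)
mean_growth = sum(samples) / len(samples)

# Expected value: 0.75 * 2.5% + 0.25 * (-10%) = -0.625%.
print(f"share with shrinking market {share_negative:.3f}  mean growth {mean_growth:.4f}")
```

No sample can land between -5% and 0%, which is the gap between the two ranges visible in the dialog box.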
Defining Market Penetration: The Normal
Distribution


Glossary Term:
Standard Deviation -
The square root of the
variance for a distribution.
A measurement of the
dispersion of values
around the mean.

Glossary Term:
Mean or Mean Value -
The familiar arithmetic
average of a set of
numerical observations
(the sum of the observa-
tions divided by the
number of observations).

Glossary Term:
Variance - The square of
the standard deviation,
i.e., the average of the
squares of the deviations
of a number of observa-
tions from their mean
value.


The marketing department estimates that Vision Research's eventual
share of the total market for the product will be normally distributed
around a mean value of 8% with a standard deviation of 2%.
"Normally distributed" means that Vision Research expects to see the
familiar bell-shaped curve with about 68% of all possible values for
market penetration falling between one standard deviation below
the mean value and one standard deviation above the mean value, or
between 6% and 10%. The low mean value of 8% is a conservative
estimate that takes into account the side effects of the drug that were
noted during preliminary testing. In addition, the marketing
department estimates a minimum market of 5%, given the interest shown in
the product during preliminary testing.
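
The 68% figure, and the share of an untruncated Normal curve that would fall below the 5% minimum, can be checked with a short calculation. The sketch below uses the standard Normal cumulative distribution function; the helper name `normal_cdf` is introduced here only for illustration.

```python
import math

def normal_cdf(x, mean, std_dev):
    """Cumulative probability of a Normal(mean, std_dev) variable at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std_dev * math.sqrt(2.0))))

MEAN, STD_DEV = 0.08, 0.02

within_one_sd = normal_cdf(0.10, MEAN, STD_DEV) - normal_cdf(0.06, MEAN, STD_DEV)
below_minimum = normal_cdf(0.05, MEAN, STD_DEV)

print(f"P(6% <= penetration <= 10%) = {within_one_sd:.4f}")
print(f"P(penetration < 5%)          = {below_minimum:.4f}")
```

The second figure shows how much of the curve the 5% minimum cuts off, which is why the truncation matters for the model.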


Vision Research chooses the Normal distribution to describe the 
variable "market penetration.” 


To define the assumption cell for market penetration:

1. Click cell C19.
2. Choose Define Assumption from the Cell menu.

Crystal Ball displays the Distribution Gallery dialog box.
3. Click the Normal Distribution.
4. Click OK.

Crystal Ball displays a dialog box showing the Normal distribution
you chose for cell C19.


[Figure: Cell C19: Normal Distribution dialog box, with Assumption Name set to Market Penetration]





Crystal Ball User Manual 31 




Now specify the parameters for the Normal distribution: the mean
and the standard deviation.

5. The Mean box should contain the value 8.00%. If it does not,
type 8% in the Mean box.

This represents an estimated average for market penetration of
8%.

6. Press Tab and type 2% in the Std Dev box.
This represents an estimated 2% standard deviation from the
mean.

7. Click Enter.

Crystal Ball scales the Normal distribution to fit the chart area so
the shape of the distribution does not change. However, the
percent range at the bottom of the chart does change.

8. Press Tab twice and type 5% in the left end-point grabber box.
This represents 5%, the minimum market for the product.

9. Click Enter.
The distribution changes to reflect the values you entered.

When you run the simulation, Crystal Ball will generate random
values that follow a Normal distribution around the mean value of
8%, and no values will be generated below the 5% minimum limit.


10. Click OK to return to the spreadsheet. 
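
One simple way to picture a Normal distribution with a lower limit is rejection sampling: draw from the full Normal curve and redraw any value below the minimum. The sketch below assumes that approach for illustration only; the manual does not say how Crystal Ball enforces the limit, and the function name is hypothetical.

```python
import random

def sample_market_penetration(rng, mean=0.08, std_dev=0.02, minimum=0.05):
    """Draw from Normal(mean, std_dev), redrawing anything below `minimum`."""
    while True:
        value = rng.gauss(mean, std_dev)
        if value >= minimum:
            return value

rng = random.Random(1000)
samples = [sample_market_penetration(rng) for _ in range(5_000)]
print(f"smallest draw: {min(samples):.4f}")
print(f"average draw:  {sum(samples) / len(samples):.4f}")
```

Because the lower tail is removed, the average of the draws sits slightly above the 8% mean that was entered.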

Defining Forecasts

Glossary Term:
Forecast Formula -
A formula that has
been defined as a
forecast cell.





Now that you have defined the assumption cells in your model, you 
are ready to define the forecast cells. The forecast cells contain the 
formulas that refer to one or more assumption cells. 


The president of Vision Research would like to know both the
likelihood of achieving a profit on the product and the most likely profit,
regardless of cost. Therefore, the president is interested in both gross
profit (cell C21) and net profit (cell C23) for the ClearView project.


Calculating Gross Profit 


Crystal Ball can generate more than one forecast when running a 
simulation. In this case, you will want to define both the gross profit 
and net profit formulas as forecast cells. First, look at the contents of 
the cell for gross profit: 


1. Click cell C21. 


The cell contents are displayed in the formula bar near the top of
your screen. The contents are C16*C19*C20. Crystal Ball will use
this formula to calculate gross profit by multiplying Persons With
Nearsightedness After One Year (C16) by Market Penetration
(C19) by Profit Per Customer (C20).


Now that you understand the gross profit formula, you are ready to
define the forecast formula cell for gross profit.


To define the forecast cell for gross profit: 


2. Choose Define Forecast from the Cell menu. 


The Define Forecast dialog box is displayed. You may enter a
name for the forecast. Since the forecast cell has a name next to it
on the spreadsheet, that name is displayed in the dialog box.


[Figure: Define Forecast dialog box, showing the Forecast Name box and the Display Forecast Automatically options (While Running, When Stopped)]

Use the forecast name that is displayed, rather than typing a new
name.

Next, you will indicate that the forecast chart is in millions of dollars,
since the spreadsheet model involves millions of dollars, and request
that the forecast chart be displayed during the simulation:

3. Press Tab and type millions in the Units box.

4. Click the Display Forecast Automatically box, if it is not already
checked.


5. Click OK to return to the spreadsheet. 


Calculating Net Profit 


Before defining the forecast cell formula for net profit, look at the 
contents of the cell for net profit: 


1. Click cell C23.

The contents are displayed in the formula bar near the top of
your screen. The contents are: IF(C11,C21-C7,-C4-C5).

The formula translates as: If the FDA approves the drug (C11 is
true), then calculate net profit by subtracting total costs (C7) from
gross profit (C21). However, if the FDA does not approve the
drug (C11 is false), then calculate net profit by deducting both
development costs (C4) and testing costs (C5) incurred to date.
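
The two forecast formulas can be written out as ordinary functions to make the logic explicit. The sketch below mirrors the cell formulas quoted above; the function names and the input numbers are hypothetical and are not taken from the "Vision Research" spreadsheet.

```python
def gross_profit(persons_nearsighted, market_penetration, profit_per_customer):
    """Mirrors C21 = C16*C19*C20."""
    return persons_nearsighted * market_penetration * profit_per_customer

def net_profit(fda_approved, gross, total_costs, development_costs, testing_costs):
    """Mirrors C23 = IF(C11, C21-C7, -C4-C5)."""
    if fda_approved:
        return gross - total_costs
    return -development_costs - testing_costs

# Hypothetical inputs, chosen only to exercise both branches.
gross = gross_profit(persons_nearsighted=40.0, market_penetration=0.08,
                     profit_per_customer=12.0)
print(net_profit(True, gross, total_costs=30.0,
                 development_costs=10.0, testing_costs=4.0))
print(net_profit(False, gross, total_costs=30.0,
                 development_costs=10.0, testing_costs=4.0))
```

When approval fails, the result does not depend on gross profit at all, which is why the net profit forecast shows a cluster of identical loss values.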


To define the forecast cell for net profit: 


2. Choose Define Forecast from the Cell menu.

A dialog box is displayed.

[Figure: Define Forecast dialog box for the net profit forecast]

Again, use the forecast name that is displayed in the Forecast Name
box, specify millions in the Units box, and check the Display Forecast
Automatically box.

3. Press Tab and type millions in the Units box.

4. Click the Display Forecast Automatically box, if it is not already
checked.

5. Click OK to return to the spreadsheet.

You have defined assumption and forecast cells for the "Vision
Research" spreadsheet. Now you are ready to run a simulation.


Running a Simulation 


Glossary Term:
Seed Value - The first
number in a sequence of
random numbers. A given
seed value will produce
the same sequence of
random numbers every
time you run a simulation.


When you run a simulation in Crystal Ball, you have the freedom to
stop and then continue the simulation at any time. The Run, Stop,
and Continue commands appear on the Run menu as you need them.
For example, while you are running a simulation, the Stop command
appears at the top of the menu. If you stop the simulation, the
Continue command takes its place. Practice using these commands
when you run the simulation for the ClearView project.

Before you begin the simulation, specify the number of trials and
initial seed value so your simulation will look like the forecast charts
in this tutorial. In Chapter 4, Interpreting the Results, trials and seed
value are discussed in detail.

To specify the number of trials and initial seed value:

1. Choose Run Preferences from the Run menu.
Crystal Ball displays the Run Preferences dialog box.
2. Type 200 in the Maximum Number of Trials box.

3. Click the Use Same Sequence of Random Numbers box and type 1000
in the Initial Seed Value box.

The value 1000 is used as an arbitrary number for the initial seed
value.

4. Click OK.
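
The effect of a fixed seed value is easy to demonstrate with any pseudo-random number generator. The sketch below uses Python's built-in generator, which is not the one Crystal Ball uses, so the numbers themselves will not match the tutorial; only the repeatability carries over.

```python
import random

def run_trials(seed, trials=200):
    """Return `trials` pseudo-random draws from a generator started at `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(trials)]

first_run = run_trials(seed=1000)
second_run = run_trials(seed=1000)
other_seed = run_trials(seed=1001)

print(first_run == second_run)   # same seed, same sequence
print(first_run == other_seed)   # different seed, different sequence
```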
Now practice using the Run, Stop, and Continue commands: 


1. Choose Run from the Run menu. 


Crystal Ball displays the Net Profit forecast chart neatly stacked on
top of the Gross Profit forecast chart. As the simulation proceeds,
the forecast charts reflect the changing values in the forecast cells.


2. Choose Stop from the Run menu on the front forecast window. 

Crystal Ball updates the forecast charts to reflect the current
values in the forecast cells. You can also stop the simulation by
pressing Alt-U, O.

3. Choose Continue from the Run menu on the front forecast
window.

Crystal Ball continues the simulation. You can also continue the
simulation by pressing Alt-U, U.

You may not be able to see two complete forecast charts at the same
time. However, there are several ways to bring individual forecast
windows to the front of the window stack. The easiest way is to click
on the forecast window if it is visible.

1. Click on the Gross Profit If Approved forecast window.
The Gross Profit forecast chart is displayed.
2. Choose Forecast Windows from the Run menu on the front
forecast window. Click Open All Forecasts to move the Net Profit
forecast chart to the front again.


Excel Note: In Excel, each forecast window has its own menu bar. In Lotus
1-2-3, there is one menu bar for all the forecast windows.

Frequency

Certainty Level

While the simulation is running, Crystal Ball displays a frequency
distribution for each forecast to reflect the changing values in the
forecast cell. The frequency distribution is displayed as columns on
your screen.

3. Continue to run the simulation until it stops at 200 trials.











[Figure: Forecast: Net Profit (MM) frequency chart. The horizontal axis runs from ($15.0) to $40.0 in millions; the certainty range runs from -Infinity to +Infinity, with Certainty at 100.00%.]


A frequency distribution shows the number or frequency of values
occurring in a given group interval. In the example above, the
frequency distribution on the Net Profit forecast chart shows a
frequency of 9 for the group interval that contains the most values.
That means 9 values occurred in the group interval. Chapter 4,
Interpreting the Results, describes how the forecast report provides a list
of the group intervals.
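
Counting values into group intervals is the same operation as building a histogram. The sketch below shows one way to do it, assuming equal-width intervals across the display range; the number of intervals and the sample values are hypothetical.

```python
def frequency_counts(values, low, high, groups):
    """Count how many values fall in each of `groups` equal-width intervals."""
    width = (high - low) / groups
    counts = [0] * groups
    for value in values:
        if low <= value <= high:
            # The top edge belongs to the last interval.
            index = min(int((value - low) / width), groups - 1)
            counts[index] += 1
    return counts

# Hypothetical forecast values in millions of dollars.
values = [-12.0, -3.5, 1.0, 2.5, 4.0, 4.5, 9.0, 11.0, 18.0, 40.0, 55.0]
counts = frequency_counts(values, low=-15.0, high=40.0, groups=11)
print(counts)
print(max(counts))
```

The value 55.0 lies outside the display range and is not counted, which parallels the way extreme values can be left off the forecast chart.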





Chapter 3, Setting Up and Running a Simulation, and Chapter 4,
Interpreting the Results, describe the forecast chart in more detail. For now,
remember that the forecast chart graphs the forecast results and
shows how the forecast values are distributed. As the simulation
progresses, Crystal Ball continues to update the frequency distribution
for each forecast cell and the forecast results become more accurate.

Interpreting the Results

Now that you have run the simulation, you are ready to interpret the
forecast results in more depth. The president of Vision Research faces
a difficult decision: should the company scrap the ClearView project
or proceed to develop and market this revolutionary new drug? To
examine this question you need to look at the forecast chart in more
detail.


Understanding the Forecast Chart 


Excel Note: Crystal Ball windows are separate from Excel windows. If
Crystal Ball's windows disappear from your screen, they are usually simply
behind the main Excel window and you can bring them to the front by
pressing Alt-Tab.


Crystal Ball forecasts the entire range of results for the Vision
Research project. However, the forecast charts show only the display
range, which by default is set to exclude the most extreme values in
the forecast. Chapter 4, Interpreting the Results, describes how to
change the display range to examine specific sections of the forecast
in greater detail.

In this example, the display range includes the values from -$15.0
million to $40.0 million, as shown on the Net Profit forecast chart.

The forecast chart also shows the certainty range for the forecast. By
default, the certainty range includes all values from negative infinity
to positive infinity. In the next section, you will learn to change the
certainty range.

Crystal Ball compares the number of values in the certainty range
with the number of values in the entire range to calculate the
certainty level. The example above shows a certainty level of 100%, since
the initial certainty range includes all possible values.

Remember, the certainty level is an approximation, since the
spreadsheet model can only approximate the elements of the real world.

In the upper-right corner of the forecast chart, Crystal Ball shows the
number of trials. This indicates the number of trials currently being
shown in the display range. Since the display range is initially set by
default to exclude extreme values from being displayed, this number
may sometimes be less than the total number of trials.


Determining the Certainty Level 


Left end-point grabber

Range minimum box

Certainty Level


Now the Vision Research president wants to know how certain Vision 
Research can be of achieving a profit and what are the chances of a 
loss. 


To determine the certainty level of a specific value range: 


1. Press Tab twice and type 0 in the range minimum box on the Net 
Profit forecast chart. 


2. Press Enter. 


Crystal Ball moves the left end-point grabber to the break-even
value of $0.0 and recalculates the certainty level.

[Figure: Forecast: Net Profit (MM) frequency chart with the left end-point grabber at $0.0]

Analyzing the Net Profit forecast chart again, you can see that the
value range between the end-point grabbers shows a certainty level of
82.5%. That means that Vision Research can be 82.5% certain of
achieving a net profit. You can therefore calculate a 17.5% chance of
suffering a net loss (100% minus 82.5%).
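
The certainty level is a simple ratio: the number of forecast values inside the certainty range divided by the total number of values. A minimal sketch follows; the function name and the ten sample values are hypothetical, so the percentages differ from the tutorial's.

```python
import math

def certainty_level(values, range_min=-math.inf, range_max=math.inf):
    """Percentage of forecast values that fall inside the certainty range."""
    inside = sum(1 for value in values if range_min <= value <= range_max)
    return 100.0 * inside / len(values)

# Hypothetical net profit values in millions of dollars.
net_profits = [-14.0, -14.0, -3.0, 1.5, 2.5, 3.0, 4.5, 6.0, 9.0, 12.0]

print(certainty_level(net_profits))                 # whole range
print(certainty_level(net_profits, range_min=0.0))  # chance of a profit
print(certainty_level(net_profits, range_min=4.0))  # at least $4 million
```

Subtracting the second figure from 100 gives the chance of a loss, the same arithmetic used for the 17.5% figure above.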

Now the president of Vision Research would like to know the
certainty of achieving a minimum profit of $2,000,000. With Crystal Ball
you can easily answer this question.

3. Type 2 in the range minimum box.

4. Press Enter.

Crystal Ball moves the left end-point grabber to $2.0 and
recalculates the certainty level.


[Figure: Forecast: Net Profit (MM) frequency chart, 200 Trials Shown. The horizontal axis runs from ($15.0) to $40.0 in millions, with Certainty at 78.50%.]


Vision Research can be 78.5% certain of achieving a minimum net
profit of $2,000,000.

Vision Research is very encouraged by the forecast results. The
president now wants to know how certain Vision Research can be of
achieving a minimum net profit of $4,000,000. If Crystal Ball shows
that Vision Research can be at least two-thirds certain of a $4,000,000
net profit, the president is ready to go ahead with the ClearView
project. Again, Crystal Ball can easily answer this question.

5. Type 4 in the range minimum box.
6. Press Enter.

Crystal Ball moves the left end-point grabber to $4.0 and
recalculates the certainty level.

The Net Profit forecast chart (above) shows a certainty level of 70%.
With 70% certainty of a minimum net profit of $4,000,000, Vision
Research decides to go ahead with the ClearView project and proceed
to develop and market this revolutionary new drug.





The president of Vision Research is also interested in the most likely
profit regardless of cost. You can now analyze the gross profit forecast
chart as you did the net profit chart.


Summary 


In this tutorial, you have explored only a few questions that Vision
Research might ask as they analyze the results of the simulation. As
you read through this manual, you will learn to explore the forecast
results in more depth. For example, you can customize the forecast
charts, create trend charts, analyze the sensitivity of the model,
interpret the descriptive statistics, and print comprehensive reports
for any simulation. Crystal Ball provides all these features so that you
can be confident about achieving the results you are looking for.

The Crystal Ball Toolbar

To aid in setting up spreadsheet models and running simulations,
Version 3.0 of Crystal Ball comes with a customized toolbar that
provides instant access to the most commonly used menu commands.

The Excel toolbar looks like this:





The buttons in the first three groups are from the Cell menu:

Define Assumption
Define Forecast

Select All Assumptions
Select All Forecasts

Copy Assumptions/Forecasts (Excel) (Lotus 1-2-3)
Paste Assumptions/Forecasts (Excel) (Lotus 1-2-3)
Clear Assumptions/Forecasts (Excel) (Lotus 1-2-3)

The buttons from the next four groups are from the Run menu:

Run Preferences

Forecast Windows
Open Trend Chart
Open Sensitivity Chart

Create Report
Extract Data

The last button is from the Excel Help menu:

Help (Excel) (Lotus 1-2-3)

Open Crystal Ball (Lotus 1-2-3)





In Excel... 


When you close Crystal Ball, it remembers the state of the toolbar. If
the toolbar was showing when you closed Crystal Ball, it will be
automatically revealed the next time you open Crystal Ball. If the
toolbar was hidden when you closed Crystal Ball, it will remain
hidden the next time you start Crystal Ball.

If you don't want to use the Crystal Ball toolbar, select the Toolbars
command from the Options menu. The Toolbars dialog box will
appear. Select Crystal Ball from the Show Toolbars list (it's probably
hiding at the end of the list; click on the bottom scroll arrow to reveal
it). Click the Hide button. This hides the toolbar. You can get it back
by clicking the Show button.


In Lotus 1-2-3... 


The Lotus toolbar works and looks very much like the Excel toolbar.
To show and hide the toolbar, use the pop-up menu on the bottom of
the Lotus 1-2-3 main window.

Closing Crystal Ball

Glossary Term:
Forecast Definition -
The forecast name and
parameters assigned to
a cell in a Crystal Ball
dialog box.

Glossary Term:
Forecast Value -
A value calculated by the
forecast formula during
an iteration. These values
are kept in a list for each
forecast, and are
summarized graphically
in the forecast chart and
numerically in the
descriptive statistics.


At this point, you can close Crystal Ball and continue reading to learn 
how Crystal Ball takes the risk out of your spreadsheet analysis. 


To close Crystal Ball:

1. Choose Close Crystal Ball from the Run menu on the menu bar.

A dialog box is displayed, asking you to confirm your decision.
If you click OK, the Cell and Run menus are removed from the
menu bar and Crystal Ball is closed. However, the "Vision
Research" spreadsheet will remain on your screen.

Crystal Ball Note: Crystal Ball will also close automatically when you exit
from the spreadsheet application.


Crystal Ball keeps your assumption and forecast definitions (but 
not the forecast values) with the spreadsheet. When you save 
your spreadsheet, the definitions are saved with it. To learn about 
saving forecast values, see the Save Run/Restore Run section in 
Chapter 3, Setting Up and Running a Simulation. 

Understanding the Sensitivity Chart

The Sensitivity Chart feature provides you with the ability to quickly
and easily judge the influence each assumption cell has on a
particular forecast cell. During a simulation, Crystal Ball ranks the
assumptions according to their importance to each forecast cell. The
Sensitivity Chart displays these rankings as a bar chart, indicating
which assumptions are the most important or least important ones in
the model. You can output (print) the Sensitivity Chart on the report
or copy it to the clipboard.

The Sensitivity Chart feature provides three key benefits:

1. You can find out which assumptions are influencing your
forecasts the most, reducing the amount of time needed to refine
estimates.

2. You can find out which assumptions are influencing your
forecasts the least, so that they may be ignored or discarded
altogether.

3. As a result, you can construct more realistic spreadsheet models
and greatly increase the accuracy of your results because you will
know how all of your assumptions affect your model.


Creating the Sensitivity Chart 


In the examples directory on your Crystal Ball disk there is a "Toxic
Waste Site" spreadsheet you can use to experiment with the
Sensitivity Chart feature. The Sensitivity Chart you create will display, in
descending order, the assumptions in a risk assessment of a toxic
waste site. The assumption with the highest level of sensitivity can be
considered the most influential assumption in the model.


To create a Sensitivity Chart: 


1. Close any spreadsheet windows that are currently open.
2. Choose Open from the File menu.
3. Open the TOXIC.XLS (Excel) or TOXIC.WK4 (Lotus) spreadsheet.

4. Choose the Sensitivity Analysis option in the Run Preferences
dialog box.

[Figure: Run Preferences dialog box, Run Options, with Sensitivity Analysis selected]

5. Run a simulation.
6. Stop the simulation.
7. Select Open Sensitivity Chart from the Run menu.

A window will open displaying the sensitivity rankings of the
assumptions in your simulation.

[Figure: Sensitivity Chart window, Target Forecast: Risk Assessment]

If you select Open Sensitivity Chart but forgot to make the
appropriate selection in the Run Preferences dialog box, you will need to reset
the simulation and run it again.

The assumptions (and possibly other forecasts) are listed on the left
side, starting with the assumption with the highest sensitivity.
Assumptions appear as green bars and forecasts appear as blue bars
unless you change the color settings. Use the scroll bar to view the
entire bar chart.

In this example, there are four assumptions listed in the Sensitivity
Chart. The first assumption, Volume of Water per Day, has the
highest sensitivity ranking and can be considered the most important
assumption in the model. A researcher running this model would
want to investigate this assumption further in the hopes of reducing
its uncertainty, and therefore its effect on the target forecast. The last
assumption, Concentration of Contaminant in Water, has the lowest
sensitivity ranking and is the least important assumption in the
model. The effect of this assumption on the target forecast is not as
great as the others and, in this particular case, could be ignored or
eliminated as an assumption altogether. Sensitivity charts like this
one illustrate that one or two assumptions typically have a dominant
effect on the uncertainty of a forecast.

How Crystal Ball Calculates Sensitivity

Crystal Ball calculates sensitivity by computing Spearman rank
correlation coefficients between every assumption and every forecast
cell while the simulation is running. Correlation coefficients provide a
meaningful measure of the degree to which assumptions and forecasts
change together. If an assumption and a forecast have a high
correlation coefficient, it means that the assumption has a significant impact
on the forecast (both through its uncertainty and its model
sensitivity). Positive coefficients indicate that an increase in the assumption
is associated with an increase in the forecast. Negative coefficients
imply the reverse situation. The larger the absolute value of the
correlation coefficient, the stronger the relationship.
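
A Spearman rank correlation is the ordinary correlation coefficient computed on ranks rather than on raw values. The sketch below shows the textbook calculation in plain Python; it is not Crystal Ball's implementation, and the trial values are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            result[order[position]] = shared
        start = end + 1
    return result

def pearson(xs, ys):
    """Ordinary (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical trial values for one assumption and one forecast.
assumption = [0.05, 0.07, 0.08, 0.09, 0.11, 0.12]
forecast = [1.2, 2.9, 2.5, 4.0, 6.1, 9.7]
print(round(spearman(assumption, forecast), 3))
```

Because only the ordering of values matters, the coefficient is unaffected by the units or scale of either cell.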


Crystal Ball also computes the correlation coefficients for all pairs of
forecasts while the simulation is running. You may find this
sensitivity information useful if your model contains several intermediate
forecasts that feed into a final forecast.


An option in the Sensitivity Preferences dialog box lets you display the
sensitivities as a percentage of the contribution to the variance of the
target forecast. This option, called Contribution to Variance, doesn't
change the order of the items listed in the Sensitivity Chart and
makes it easier to answer questions such as "what percentage of the
variance or uncertainty in the target forecast is due to assumption
X?". However, it is important to note that this method is only an
approximation and is not precisely a variance decomposition. Crystal
Ball calculates Contribution to Variance by squaring the rank
correlation coefficients and normalizing them to 100%.
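
The squaring and normalizing step can be sketched in a few lines. The assumption names and rank correlation values below are hypothetical, chosen only to show the arithmetic.

```python
def contribution_to_variance(rank_correlations):
    """Square each rank correlation and rescale so the results sum to 100%."""
    squares = {name: rho ** 2 for name, rho in rank_correlations.items()}
    total = sum(squares.values())
    return {name: 100.0 * value / total for name, value in squares.items()}

# Hypothetical rank correlations against one target forecast.
correlations = {"Assumption A": 0.8, "Assumption B": -0.4, "Assumption C": 0.2}
for name, percent in contribution_to_variance(correlations).items():
    print(f"{name}: {percent:.1f}%")
```

Squaring discards the sign, so this view shows relative importance only; the direction of each relationship is visible only in the rank correlations themselves.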


Caveats 


The Sensitivity Chart feature has several limitations you should be
aware of:

1. The sensitivity calculation may be inaccurate for correlated
assumptions. For example, if an important assumption were
highly correlated with an unimportant one, the unimportant
assumption would likely have a high sensitivity with respect to
the target forecast. Assumptions that are correlated are flagged
as such on the Sensitivity Chart. In some circumstances, turning
off correlations in the Run Preferences dialog box may help you to
gain more accurate sensitivity information.

2. The sensitivity calculation may be inaccurate for assumptions
whose relationships with the target forecast are not monotonic.
A monotonic relationship means that an increase in the
assumption tends to be accompanied by a strict increase in the forecast;
or an increase in the assumption tends to be accompanied by a
strict decrease in the forecast.

For example, the relationship y = Log(x) is monotonic, while
y = Sin(x) is not.
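
The monotonicity caveat can be seen numerically: a rank correlation captures y = Log(x) perfectly but reports a weak relationship for y = Sin(x), even though y depends entirely on x in both cases. The sketch below uses the classic no-ties Spearman formula; the sampling grid is an arbitrary choice made for illustration.

```python
import math

def rank_correlation(xs, ys):
    """Spearman rank correlation for lists without tied values."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        result = [0] * len(values)
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result
    n = len(xs)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

# x sweeps roughly four periods of the sine wave.
xs = [0.5 + 0.063 * i for i in range(400)]
log_rho = rank_correlation(xs, [math.log(x) for x in xs])
sin_rho = rank_correlation(xs, [math.sin(x) for x in xs])
print(f"y = Log(x): {log_rho:.3f}")
print(f"y = Sin(x): {sin_rho:.3f}")
```

A low coefficient for the sine case does not mean the assumption is unimportant, which is exactly the situation this caveat warns about.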


Customizing the Sensitivity Chart 


Use the Sensitivity Prefs dialog box to customize the Sensitivity Chart.
As you become more familiar with the Sensitivity Chart, practice
selecting preferences that help you get the answers you are looking for
and are appropriate for the data you are working with.

1. Click Sensitivity Prefs to open the Sensitivity Preferences dialog
box (an example appears below).

The Target Forecast list box allows you to choose which forecast cell is
the target of the sensitivity analysis.

The Measure by option allows you to determine if the bar chart will
show the sensitivities as rank correlations or contributions to
variance. Rank correlations range from -1 to +1 and indicate both
magnitude and direction. Contributions to variance range from 0% to 100%
and indicate relative importance.

The Include option lets you select which types of Crystal Ball data are
to be ranked against the target forecast. You have three display options:
Assumptions, Forecasts, or Both (by selecting both Assumptions and
Forecasts).

[Figure: Sensitivity Preferences dialog box, showing the Target Forecast
list (Risk Assessment), the Measure by options (Rank Correlation,
Contribution to Variance), the Include options (Assumptions, Other
Forecasts), and the Cutoff Criteria options (Display Only the Highest
Sensitivities, Display Only Sensitivities Greater Than)]

Crystal Ball will always include all the assumptions and forecasts in
your model even though they may be unrelated. Generally, this is not
an issue since unrelated assumptions and forecasts will have
sensitivity rankings close to zero. However, correlation may affect sensitivity
analysis if there is a strong correlation between the variables and at
least one of the variables is highly sensitive.

The Cutoff Criteria option gives you an even greater level of control
over how many sensitivities appear on the Sensitivity Chart list by
allowing you to assign values for count cutoff, value cutoff, or both.

Creating Reports

A report can be created for each forecast using the Report command.
Any or all of the following items can be included in the report using
the Report dialog box:

* Trend charts
* Sensitivity charts
* Forecast summaries
* Statistics
* Forecast charts
* Percentiles
* Frequency counts
* Assumption parameters
* Assumption charts

To create a report:

Choose Create Report from the Run menu.

The Create Report dialog box is displayed.

[Figure: Create Report dialog box, showing the Report Sections options
(Trend Chart, Sensitivity Chart, Forecasts, Assumptions), the Forecasts
options (All, Chosen, Open), the Assumptions options (All, Chosen), and
the Percentiles options]

Select information to be included in the report with the following
steps:

1. To include all forecasts in the report, click All in the Forecasts
options.

2. To include only selected forecasts, and to specify the order in
which they will appear in the report, click Chosen in the Forecasts
options.

A dialog box is displayed allowing you to choose from a list of
available forecasts.
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3. To Include only those forecasts which are currently open, click 15. To Include the a à sumption и Куа 
| Ореп In the Forecasts options. 


| tlon Information, click All In the As- | а, Фаина лр 
^ Jomptlons options. те | | reduce or enlarge the size of the assumption chart, enter the 
sumptions options. 


percentage In the 95 box. 
s. To Include only selected assumptions, and to specify the order in 


which they will appear In the report, click Chosen In the Assump- If you choose Percentiles from the Report Sections options, you now 


tons options. choose the percentiles you want displayed. 
| ou to choose from a list of 
A dialog box is displayed allowing you [...] available forecasts.

6. To include the trend chart in the report, click the Trend Chart
check box in the Report Sections options. To reduce or enlarge
the size of the trend chart, type the percentage in the % box.

7. To include the sensitivity chart in the report, click the Sensitivity
Chart check box in the Report Sections options. To reduce or
enlarge the size of the sensitivity chart, type the percentage in
the % box.

8. To include forecast information, click the Forecasts check box in
the Report Sections options.

The Forecasts option can include the following subsections:

9. To include a brief summary of the forecast results, click the
Summary check box.

10. To include the statistics, click the Statistics check box.

11. To include the forecast chart, click the Chart check box. Type the
percentage in the % box to reduce or enlarge the size of the
forecast chart.

The Percentiles option shows the certainty of achieving a value
below a particular level. For example, if a report was printed [...]
discussed in the "Vision Research" tutorial in Chapter 1, the
Percentiles section might show that you could be 75% certain of
achieving a net profit or loss below the threshold value of
$17 million.

To include [...] for the forecast's group intervals, click the
Frequency Counts check box.

The Group Intervals shows the starting value, ending value,
probability, and frequency for each group interval.

14. To include assumption information, click the Assumptions check
box in the Report Sections options.

The Assumptions option can include the following subsections:

17. To select which percentiles to include in the report, click the
appropriate button in the Percentiles options.

The Percentiles options for the forecast report show the certainty
of achieving a value below a particular threshold. The first option
divides the frequency distribution into quartiles (four sections),
showing the value levels for the following percentiles: 0%, 25%,
50%, 75%, and 100%. The next option divides the distribution
into quintiles (five sections). The Deciles option creates 10
sections. The next three options show the value levels for the
following levels, respectively:

° 2.5%, 5%, 50%, 95%, and 97.5%
° 5%, 25%, 50%, 75%, and 95%
° 10%, 25%, 50%, 75%, and 90%

18. Click OK.

Crystal Ball creates the report as an Excel or Lotus 1-2-3 spread-
sheet. You can modify, print, or save the report in the same way
as any other spreadsheet.

Crystal Ball Note: To suppress the [...] report header and any cell
[...] Report from the Run menu.

Crystal Ball Note: If the simulation has not been run or if it has been
reset, the report will include only an assumptions section. Options in the
[...] dialog box affecting items other than assumptions will be
disabled.

Crystal Ball User Manual 
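
The percentile options described above can be pictured with a small Python sketch. The manual does not say how Crystal Ball computes percentiles, so the linear interpolation used here, and the names percentile and percentile_table, are assumptions made purely for illustration.

```python
def percentile(values, pct):
    """Value below which roughly `pct` percent of the forecast values fall.

    Assumption: linear interpolation between neighbouring sorted values.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one forecast value")
    rank = (pct / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


# The quartile option lists these five levels.
QUARTILES = (0, 25, 50, 75, 100)


def percentile_table(values, levels=QUARTILES):
    """Map each requested percentile level to its value level."""
    return {level: percentile(values, level) for level in levels}


# Hypothetical forecast values from 101 simulation trials.
trials = list(range(101))
print(percentile_table(trials))
```

Passing a different tuple of levels, such as (10, 25, 50, 75, 90), gives the table for one of the other options.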
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Tutorial 


Tutorial—A Sample Application With Braincel 





The tutorial shows you how you can set up a sample application
with Braincel. We'll show you how to teach a Braincel to make
loan repayment probability forecasts based on past applications.


Important: The Tutorial application is simplified for clarity. A
real-world Loan Expert would need a larger database than the
one we are showing here.


Setting up a problem for Braincel





Gather data relevant to the problem you're presenting to the 
program. 


To create our Braincel Loan Expert, we first decided exactly what 
we were looking to predict. We decided on Loan Repayment 
Ability. We will teach the Braincel Expert to determine how able 
each applicant is to repay a $2000.00 personal loan. 


Basic data was collected: for example, information on monthly
income and expenses, how long the applicants had worked at
their jobs, and so on. Also collected was a human loan officer's
decision on each application, assessing the ability of the applicant
to repay the loan on a scale of 1 to 5. A "1" means very poor loan
repayment probability and a "5" means excellent loan repayment
probability.


Organize the data into two sets: inputs and outputs. Put the
inputs and outputs in individual columns.


• An input is any data that is used by the expert to arrive at
a solution, prediction or decision.
- For the Loan Expert, the inputs are the 8 pieces of
information collected on each applicant.


• An output is the solution, prediction or decision that
Braincel will be learning to produce.
- For the Loan Expert, the output is the loan officer's
decision on the repayment ability of each applicant.
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Figure 1.1 BCDATA.XLS 


To see the data loaded for the Tutorial Loan Expert:
Open BCDATA.XLS from the Braincel directory.


This worksheet contains the data from 17 past loan applicants. 
Each applicant's information is in a separate row in the
worksheet. The data is arranged in database fashion, with each
column as a separate input or output. We placed the output in 
the right-most column to make it easier to keep track of. 


Allow for minimum and maximum values for each data 
column. 


Minimum and maximum values let Braincel know what it can 
expect to see in each column. The program mathematically scales
each column when it performs its calculations; the min/max 
values serve as the endpoints for this internal scaling procedure. 
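
The internal scaling described above can be pictured with a short Python sketch. The manual does not give Braincel's actual scaling formula, so the linear mapping onto the interval 0 to 1 and the function name scale_column are assumptions made for illustration only.

```python
def scale_column(values, col_min, col_max):
    """Linearly map each value so that col_min -> 0.0 and col_max -> 1.0.

    Assumption: a plain linear mapping; the manual only says the min/max
    values serve as the endpoints of the internal scaling.
    """
    span = col_max - col_min
    if span <= 0:
        raise ValueError("col_max must be greater than col_min")
    return [(value - col_min) / span for value in values]


# Hypothetical monthly-income column whose min/max rows hold 1000 and 5000.
incomes = [1000, 2500, 5000]
print(scale_column(incomes, 1000, 5000))  # [0.0, 0.375, 1.0]
```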


Braincel will automatically calculate minimum and maximum
values for you. It uses two blank rows directly below the past 
loan applicants. 


Note: Non-numerical data is treated separately. Refer to 
Variations—Using text in your data. 
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Define three data sets for training and testing 





Your Braincel Expert needs to see the data in three different sets 
as part of its learning process. 


Using the Excel Define Name command, define three data sets
as named ranges within the total available data. 


These ranges are called: the Training Range, the Test Range and 
the Predict Test Range. Naming these ranges with Excel makes 
it easier to reference them. 


Define the Training Range. 


This range should include approximately 90% of your data
records. This is the range that your Braincel Expert will learn
from. Be sure to include two empty rows for minimum and
maximum values.
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Figure 1.3 The Training Range highlighted 

The highlighted cells represent the training range for this Expert. 
Include all input and output columns for each record, with the min and 
max rows. 
To define a Training Range on BCDATA.XLS:
1. Activate BCDATA.XLS, if necessary.
2. Select cells R7C3:R23C11. (Loan records 3-17 with
the min and max rows)
3. Select Formula Define Name.
4. Name this range TRAINING_RANGE.
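
As a quick check on the cell reference in step 2, the following Python sketch counts the rows and columns that an R1C1 range covers. The helper name range_shape is hypothetical and is not part of Braincel or Excel.

```python
import re

# Matches a single rectangular R1C1 range such as "R7C3:R23C11".
_R1C1_RANGE = re.compile(r"^R(\d+)C(\d+):R(\d+)C(\d+)$")


def range_shape(reference):
    """Return (row_count, column_count) for an R1C1 range reference."""
    match = _R1C1_RANGE.match(reference)
    if match is None:
        raise ValueError(f"not an R1C1 range: {reference!r}")
    top, left, bottom, right = (int(part) for part in match.groups())
    return bottom - top + 1, right - left + 1


rows, columns = range_shape("R7C3:R23C11")
# 17 rows: loan records 3-17 (15 records) plus the 2 min/max rows.
# 9 columns: the 8 inputs plus the 1 output.
print(rows, columns)  # 17 9
```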


Define the Test Range. 


Instead of including the output column when using the Excel 
Define Name command, include an empty column. Braincel uses 
this column to write the output it calculates for each record. For 
the Loan Expert, we have titled the empty column Calculated 
Output. 


The Test Range is used to determine how well the Braincel 
Expert has learned to mimic the outputs it has seen during 
training. 
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Figure 1.4 The Test Range highlighted 

The highlighted cells represent the Test Range. The Braincel Expert 
will write its calculations into the empty column later, during testing. 
You compare its calculation with the correct answer in the adjacent 
column. We are testing on a small subset of the Training Range, 
records 6-10. 
To define a Test Range on BCDATA.XLS:


1. Activate BCDATA.XLS, if necessary.

2. Select cells R10C3:R14C10,R10C12:R14C12 as a
complex range. (Both blocks of cells should be
highlighted at the same time.)

3. Select Formula Define Name.

4. Name this range TEST_RANGE.


Define the Predict Test Range. 


This range consists of the 10% of your data that you withheld
from the Training Range. The records were not in the Training or 
Test Ranges, so they'll be fresh data for the Expert after it's been 
trained, later in the Tutorial. 


This range tests the predicting ability of your new Expert. By
comparing the Expert's calculation against the historical output,
you'll preview how accurate the Expert will be when given new
data.
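
The 90%/10% division between the Training Range and the Predict Test Range can be sketched in Python as follows. This only illustrates the idea; in Braincel you make the split by defining named ranges in Excel. The function name split_records and the random selection of the withheld records are assumptions (the tutorial itself simply withholds records 1 and 2).

```python
import random


def split_records(records, holdout_fraction=0.10, seed=0):
    """Split records into (training, predict_test) lists.

    About `holdout_fraction` of the records, chosen at random, are withheld
    from training so they are fresh data for the trained Expert.
    """
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    holdout_count = max(1, round(len(records) * holdout_fraction))
    holdout = set(indices[:holdout_count])
    training = [rec for i, rec in enumerate(records) if i not in holdout]
    predict_test = [rec for i, rec in enumerate(records) if i in holdout]
    return training, predict_test


# 17 loan records, as in BCDATA.XLS: 15 for training, 2 withheld.
training, predict_test = split_records(list(range(1, 18)))
print(len(training), len(predict_test))  # 15 2
```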





Figure 1.5 The Predict Test Range highlighted 


To define a Predict Test Range on BCDATA.XLS:
1. Activate BCDATA.XLS, if necessary.

2. Select cells R5C3:R6C10,R5C12:R6C12 as a complex
range. (Both blocks of cells should be highlighted at
the same time.)

3. Select Formula Define Name.

4. Name this range PREDICT_TEST.


Note: You may choose to continue the tutorial with 
BCDATA.XLS or you may open a worksheet that we have 
prepared, BCTUTOR.XLS. 
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Create an Expert file 


Now that you have finished setting up the worksheet in Excel, 
you are ready to use Braincel. 


To create a Braincel Expert:
1. Select Braincel Braincel Menu in the Excel menu.
The Braincel menu replaces the regular Excel menu.


2. Select File New Expert.
3. Fill the box as shown in Figure 1.6 below.


Figure 1.6 New Expert box
Properly filled in for the tutorial, with Expert Name set to Tutorial.


Field Definitions 


Network Name 


Enter the file name of your expert: Tutorial. (You can use up to
eight characters. Braincel assigns all files created from the New
Expert dialog box the file extension .NET.)


Number of Inputs
Enter the number of input columns in the training range — 8. 


Number of Outputs 
Enter the number of output columns in the training range — 1. 


Note: The Password is optional and is not used in the Tutorial. 
See Full Menu Reference-New Expert for more information. 
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Training the Expert 





To Begin Network Training:


1. Select Expert Train Expert.

2. Select BCDATA.XLS or BCTUTOR.XLS, the worksheet
where our training data is stored.
Note: Only worksheets open in Excel will appear in
this box.

3. Select TRAINING_RANGE from the Ranges box.
Note: All defined ranges on the selected worksheet
will appear in the Ranges box.

4. Add TRAINING_RANGE to Selected Ranges. Click

5. Set error for 5% and training time for 10 minutes.

6. Allow Braincel to fill min/max rows. Press OK.

7. Accept bounded output type. Press OK.


Field Definitions 


Stop At Error (%) 

Error indicates how accurate the Expert is in calculating the 
output in your training data. Error refers to the average differ- 
ence between an output and the corresponding Expert calculated 
output, scaled for the range of the output. (For more details, see 
page 38.) The Expert will stop training when it has reached the 
error that you specify. 
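
The error definition above can be illustrated with a short Python sketch. The manual defers the exact formula to page 38, so the reading used here (mean absolute difference divided by the output's min/max range, expressed as a percentage) and the function name average_error_percent are assumptions.

```python
def average_error_percent(actual, calculated, out_min, out_max):
    """Average |actual - calculated|, scaled by the output range, in percent.

    Assumption: one plausible reading of "average difference ... scaled
    for the range of the output"; Braincel's exact formula may differ.
    """
    if not actual or len(actual) != len(calculated):
        raise ValueError("need two equally long, non-empty lists")
    span = out_max - out_min
    total = sum(abs(a - c) for a, c in zip(actual, calculated))
    return 100.0 * total / (len(actual) * span)


# Hypothetical loan-officer ratings (scale 1 to 5) and Expert calculations.
ratings = [1, 3, 5]
calculated = [1.2, 3.0, 4.6]
print(round(average_error_percent(ratings, calculated, 1, 5), 2))  # 5.0
```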


Time (hh:mm:ss) 

Time refers to how long you would like to train the network. The
network will continue training until the time is elapsed or until
the desired error is achieved, whichever comes first. Time is
measured in hours, minutes, and seconds (hh:mm:ss).
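
The "whichever comes first" rule for Stop At Error and Time can be sketched as a loop in Python. This is not Braincel's code: train_until, the step callback, and the simulated error values are hypothetical, and the clock is a parameter only so that the example is easy to check.

```python
import time


def train_until(step, stop_at_error, time_limit_seconds, clock=time.monotonic):
    """Call step() until the error target or the time limit is reached.

    step() stands in for one training pass and returns the current error
    in percent. Returns (last_error, elapsed_seconds).
    """
    start = clock()
    error = step()
    while error > stop_at_error and clock() - start < time_limit_seconds:
        error = step()
    return error, clock() - start


# Simulated error readings; note the error may rise before it falls again.
readings = iter([12.9, 9.0, 9.4, 6.0, 4.8, 3.0])
final_error, _ = train_until(lambda: next(readings), 5.0, 600)
print(final_error)  # 4.8
```

With a 5% target and a 10-minute limit, as in the tutorial, training in this sketch stops at the first reading at or below 5%.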


Note on training time: Braincel's training speed depends on how 
powerful a computer you're using to train. A math coprocessor, 
RAM memory and CPU speed all determine how fast Braincel 
runs. 


As soon as you press OK... 
THE EXPERT IS NOW TRAINING 
Figure 1.7 The two Train Expert boxes
We have filled in these boxes with the correct values for the
Tutorial. (Expert File: TUTORIAL.NET; Training Algorithm:
BackPropagation.)



The training process is begun and Braincel is teaching your 
Expert. You can monitor learning progress by watching the 
Braincel icon at the bottom of your screen. The icon is color-coded
to tell you its status. The text is green or red while training
and black when the Expert is finished training or merely open.
(Note: While training can be set to hours, minutes, and seconds,
the icon displays only hours and minutes.)


[Braincel icon, showing Current Error (Average) and Elapsed Time (in hours:minutes)]


As Braincel trains, watch the error in the Braincel icon. The error 
will change as the Braincel Expert learns. The overall trend of the 
error is downward as the Expert gets better at calculating the 
output. Often the error will increase for a time before decreasing. 
This may happen several times in a training session. This is 
normal.
Continue training until the error is approximately 5%


Add additional training time if needed to achieve a 5% error. 


Testing the Expert’s knowledge 





Test the Expert to make sure it has learned from the cases. 
Testing is done with two sets of data: the Test Range and 
Predict Test Range. 


To test Expert training on the Test Range:


1. Select Expert Ask Expert. (If necessary, select
Braincel from the Excel menu.)

2. Select BCDATA.XLS as the worksheet.

3. Add TEST_RANGE to the Selected Ranges box.
Press OK.

4. Compare the Output column to the Calculated 
Output column. 
Figure 1.8 Testing the Expert's ability on the Test Range
By comparing the calculated output with the historical output, you can
see that the Expert is doing very well on cases it has already seen and
been trained on.
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Note: If your own Tutorial.Net is not testing as well as the 
illustration above, continue training by selecting Train Expert 
and specifying a longer training time. 


When the Expert is calculating well on the Test Range, you are 
ready to test its predicting ability on the Predict Test Range. 


Testing the Expert's predicting ability 





The Predict Test Range will test the Expert's predicting capabil- 
ity. You'll be showing the network data it hasn't been trained on, 
so there's no chance of it having memorized the output. Since 
the Predict Test Range is your own historical data, you'll have a 
historical output to compare the Expert's output against. 


To test the Braincel Expert's predicting ability:
1. Select Expert Ask Expert.
2. Select the worksheet to reset the Selected Ranges
box.
3. Add PREDICT_TEST to the Selected Ranges box.
4. Compare the Output column with the Calculated
Output column.
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Figure 1.9 Comparing outputs on the Predict Test Range 
The accuracy on this range lets you know how well the Expert will 
perform on new cases. 
Your own tutorial Expert should predict as well as the illustration
in Figure 1.9. The Expert's calculated output should match
the historical output. If it doesn't, check your Expert's error by
selecting Options Network Status. If the error is higher than 5%,
continue training by selecting Train Expert and specifying a
longer training time.


Using the fully-trained Expert 





Once the Expert has been fully-trained, you're ready to use its 
knowledge on new data. 


Create an area to hold new records for Braincel to analyze 


We've copied the column headings and placed them directly below
the training records. (Move down to R25C1.) Define a range to
hold new loan application data and ask the Braincel Expert for 
its forecast on the loan repayment probability of the applicant. 


To define a New Record Range on BCDATA.XLS:

1. Activate BCDATA.XLS, if necessary.

2. Go to cell R25C1.

3. Select R29C2:R29C10.
This range includes all input columns plus an empty
column for Braincel to write its output into.

4. Using the Excel Define Name command, define this
range as NEW_RECORD. (You may need to select
File Return To Excel to use the Excel menu bar.)




Figure 1.10 The New Record Range 


The highlighted cells comprise the New Record Range. 
Show the Expert new data and get its analysis

In Figure 1.11 we're giving you a new loan application to present
to your Braincel Expert. Enter this information into the New
Record Range and then ask your Braincel Expert for its analysis.
After you've entered the information, your range should now
look like Figure 1.12.


To ask the Expert on the New Record Range:
1. Display the Braincel menu.
2. Select Expert Ask Expert.
3. Select BCDATA.XLS to clear the Selected Ranges
box.
4. Add NEW_RECORD to the Selected Ranges box.
Press OK.
5. When the Expert is finished processing, go to
R29C10, the cell in the New Record Range where the
Expert's output will appear.


Figure 1.11 A new loan application
Put the answers on this application into the appropriate
columns in the New Record Range. (Fields on the form include
Monthly Income, Monthly Expenses, Years with Previous Employer,
Years at Present Address, Years at Previous Address, and
Number of Dependents.)
Figure 1.12 The New Record Range filled in
Place the responses from the loan application in Figure 1.11 into
the cells in the New Record Range, as shown above.

Figure 1.14 The Expert's forecast
Just like the human loan officer, the Braincel Expert assigns loan
risk on a scale of 1 to 5, 1 meaning very poor loan repayment
probability and 5 meaning excellent loan repayment probability.


The Expert is linked to the New Record Range. You can enable a 
hot-link between the Expert and this range by selecting Enable 
Ask Link. Now, if you change any value in any of the cells in this 
range, the Expert will reevaluate the application. For example, 
select Enable Ask Link then change the value in the Monthly 
Income cell from 2500 to 3500 and see how that affects the 
Braincel Expert's prediction. The Expert will update its prediction
automatically. You can disable this hot-link by selecting
Disable Ask Link.


Now you're ready to move on to creating your own applications. 
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Creating a basic Braincel Expert 


This section is designed to provide details on the basic process: Excel 
worksheet setup, creating, training and using the Expert. It is a 
supplement to the Tutorial. 


This section shows you how to set up and use Braincel with the least 
amount of work using the Auto Expert User Mode (the default User 
Mode determined in the Options Setup box). Braincel will configure 
and monitor training automatically.


After using Auto Expert mode, experienced neural net users should 
refer to the Variations section to learn about the Professional User
Mode and its more advanced features.
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Creating a Basic Braincel Expert 


Quick Checklist for creating a Braincel Expert 





• Choose a problem to solve or something to forecast.


• Collect historical data as examples for the network to learn
from (or use a human expert to create cases).


• Load the data into an Excel worksheet.


• Divide the data into inputs and outputs.
  ° Each input or output into a separate column, one example
  per row.
  ° Minimum and maximum values for each column as the last
  two rows. (These rows can be blank; Braincel will fill them.)


• Define three data sets as named ranges (with the Excel Define
Name command). See Diagrams on the next two pages.

  ° Training Range: 90% of your records with min/max rows.

  ° Test Range: the training range without the historical output
  columns and including empty columns for the expert to
  write its calculations into (one empty column for each
  output).

  ° Predict Test Range: the 10% of the total records withheld
  from training, without the historical output columns and
  with empty columns for the expert to write its calculations
  into.

• Open a New Expert.

• Train the Expert with the Training Range.


• Test the Expert on the Test Range (with the Ask Expert
command).


• Test the Expert on the Predict Test Range (with the Ask
Expert command).


• Use the Expert on new data by defining a range to hold the
new records, then using the Ask Expert command.


Diagrams for the Quick Checklist

The following diagrams are not hard and fast rules; they are
schematic representations. Details on worksheet organization
are in the pages following.





Basic Worksheet Setup 
Each input or output is in a continuous column. Minimum
and maximum values are the last two rows. The outputs are
the last columns.
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The basic Training Range 

Should be approximately 90% of your records. Includes all 
inputs, the min/max value rows and the actual, historical 
outputs. 


Diagrams for the Quick Checklist cont.... 





The Test Range 

Includes all inputs in the Training Range records plus an 
empty column for each historical output. Braincel will write 
its calculations in the empty columns. The min/max rows
are not included. 
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The Predict Test Range 

Contains the 10% of your records that the Expert wasn't 
shown during training. (These records should be randomly 
selected.) Include all input columns plus an empty column
for each output.










What can I do with Braincel? 





Braincel is great for forecasting and building expertise. It works 
best with problems that share the following characteristics: 


° The cost of developing rules is prohibitive. 
° The formulas are constantly changing. 
e Lots of data is available. 


Within this framework there are two approaches to a problem 
with Braincel: 1) Imitating human expertise where the rules for 
solution are unknown, like the loan expert in our tutorial; 2) 
Information mining—looking for relationships within data. This 
approach is best used by people with expertise in the field they
are studying who want to look at their data in new ways.


Examples: 


• "How many widgets will I need in my inventory?"
• "Who is the most likely winner of today's horse race?"


Braincel should not be used when the formulas for a decision are 
already known and are fairly static. For instance, you shouldn't 
use a neural network to calculate an 8% sales tax. It's much 
easier to multiply by 0.08 than to teach the network to multiply 
with accuracy. Neural network experts should be used to solve 
problems for which it is difficult to formulate a procedural 
software solution or if the procedure is likely to change fre- 
quently. 


Another factor to consider is how you want to present the 
problem to the neural network. If you have a problem with 
several outputs, you could break the problem into several 
networks, each with one output. Additionally, you can use the 
output of one or more networks as the inputs to another net- 
work. For example, for the tutorial loan Expert, we could have 
constructed one network that estimates the stability of a loan
applicant, based on employment record and housing record.
Another network estimates the amount of money an applicant
can pay, based on assets and expenses. The outputs of these two
networks are inputs to a third network that evaluates the total
loan.


[Diagram: Network 1, applicant's stability (based on housing and employment), and Network 2 (income/expenses) both feed Network 3, loan repayment ability.]


Figure 2.1 Using the outputs of Experts as inputs to another 
Expert 

This example shows an alternative configuration for the tutorial loan 
Expert. Two networks assess specific risks. Their outputs are used in a 
third Expert that assesses the total risk. 
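
The three-network arrangement can be sketched in Python by treating each trained Expert as a function. The function cascade, the dictionary keys, and the stand-in formulas below are hypothetical; in Braincel each stage would be a separately trained Expert whose calculated output column is used as an input column of the next.

```python
def cascade(stability_net, capacity_net, total_net, applicant):
    """Feed the outputs of two specialised networks into a third one."""
    stability = stability_net(applicant["employment"], applicant["housing"])
    capacity = capacity_net(applicant["assets"], applicant["expenses"])
    return total_net(stability, capacity)


# Stand-in callables, used only so the sketch runs; they are not trained nets.
def toy_stability(employment_years, housing_years):
    return min(1.0, (employment_years + housing_years) / 20)


def toy_capacity(assets, expenses):
    return 1.0 if assets > expenses else 0.0


def toy_total(stability, capacity):
    return round(1 + 4 * stability * capacity)  # a rating on the 1-5 scale


applicant = {"employment": 6, "housing": 4, "assets": 3000, "expenses": 1800}
print(cascade(toy_stability, toy_capacity, toy_total, applicant))  # 3
```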


Braincel is good for a large range of problems from forecasting 
the stock market, forecasting sales, filtering mailing lists, to 
predicting horse races.


Example: Stock and Commodities 


Use the factors that an expert in the field of forecasting futures
uses to analyze. You can also include a couple of factors whose
influence you're unsure of. If these factors are immaterial, the
network will minimize their effect in calculations. 


Example: Mailing list filter 


Keep statistics on the efficacy of past mailings, as many factors 
as you have access to. (Hint: you could store this information in a
dBASE file and link it to the worksheet via Excel's Q+E™.) Then
train the network to predict, for example, how accurate a mailing 
will be, based on the success or failure of test mailings. 


The neural network is good at this because you can include 
factors that may or may not be immediately obvious, such as 
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economic factors. Braincel's calculating power allows you to add 
these factors without worry: if Braincel finds that they're impor- 
tant, it will use them. If they aren't important, Braincel will 
minimize their influence in its calculations. 


Example: Scientific or Technology Applications 


Braincel is an excellent tool because you can use it to minimize 
expensive and time-consuming experimentation by pinpointing 
likely success. 


For example, let's say we're developing a new type of plastic for 
a customer. Our customer will request various physical proper- 
ties for the plastic. 


We create a neural network Expert. We use as inputs the physi- 
cal properties of plastics made for previous customers. The 
variables used in making those plastics are the outputs. We train 
the Expert on those past cases. Then we ask it which variables to 
use to get the new physical properties. The plastic will be made 
at a much lower cost since fewer actual experiments must be 
done. The neural network has pointed the way to making the 
new plastic. 


Other applications include analyzing physical properties of 
metals and drugs. It’s an excellent tool for making discoveries
in engineering and scientific disciplines.


What kinds of data should I include as inputs? 





In general, you should include any information that would be 
helpful to a human decision-maker. The human decision-maker 
is known as a domain expert. This person has knowledge in the 
field under consideration. For example, when constructing a 
loan approval Expert like the one created in the tutorial, you 
should be in contact with a human loan approval officer to direct 
you towards the type of information important to such a deci- 
sion. This person would be the domain expert. 
Details on organizing your data in Excel


For example, in the tutorial we created a loan Expert. The loan 
Expert was given eight inputs even though a loan approval 
decision could probably be made on the basis of just two ques- 
tions: Monthly Income and Monthly Expenses. If a person's 
expenses are greater than his income, he shouldn't be given a 
loan. But the human loan expert who provided the historical
data looked at several other factors. These factors helped the
expert assess each applicant's loan repayment probability more
completely. For example:


• How long has the applicant lived at her current address?
• How long has she worked at the same job?


These are important to the human expert, although she may not 
be able to explain why. Therefore, we gave all the information to 
the Braincel Expert.


When you are developing your own Expert, include as many 
factors as you consider necessary. Humans often overlook their 
own decision-making criteria, so it's better to give the program 
more inputs rather than fewer. Too many unimportant inputs, 
however, may increase training time unnecessarily. 


Additionally, you should exclude factors that measure the same 
feature. For example, if you have "Gross National Product
(GNP)" as one input in a financial model, you shouldn't have
"GNP in 1982 dollars" as another input. Though they're different 
factors, they're really measuring the same thing. However, one 
may be more helpful to the neural network than the other. You 
could try making two networks, one with GNP and one with 
GNP in 1982 dollars. One may be more accurate than the other. 
This type of data manipulation is called preprocessing. There are 
many types of preprocessing. Some options are discussed in the 
Variations - Improving your Expert's accuracy.


Braincel will accept data in many forms: as numbers, formulas 
or, as we explain in the Variations section, text. You can have a 
combined total of 256 inputs and outputs. 


There are three basics to consider when organizing your data: 


1. Each input or output must be in a separate column. 

2. Each column must have space for a minimum and 
maximum value as the last two rows. 

3. Define three data sets as named ranges: the Training 
Range, the Test Range and the Predict Test Range. 


Inputs and outputs in separate columns 

Each column of data must be continuous in the defined range. 
For example, an input can't consist of rows 1-50 and 65-70. The 
input should be rows 1-70. It’s OK for rows 51-64 to be blank as
long as they are included in the defined range.


The outputs must be placed as the right-most columns when 
using a single range to define the Training Range. (You may use 
more than one range to define the total Training Range. In this 
case, the outputs need not be the right-most column or even next 
to each other. This is explained in Variations—Using Multiple 
Ranges, pp 56-59.)

Minimum and maximum values 

Each column of data must have a minimum and maximum value 
at the bottom in the last two rows. Braincel uses these min
and max values to determine what it can expect to see in the 
column. 


You can set your own minimum and maximum values or 
Braincel will calculate min/max values for you. The inputs are
given a minimum value slightly less than the actual minimum
and a maximum value slightly greater than the actual maximum.
The range for an output depends on what type of output it is. If
the problem is a classification problem (output is basically Yes/
No or some variant) or the output should never fall outside the
cases presented in training, then we call this a bounded output.
If the output is a number that could exceed the scale presented in
training, then we call this an unbounded output. An example of
an unbounded output is a raw stock price. A bounded output
would be a buy/sell signal on the stock.
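
The automatic min/max values for an input column can be sketched in Python as below. The manual says only that the values are "slightly" beyond the actual minimum and maximum, so the 5% margin and the function name autofill_input_min_max are assumptions, not Braincel's actual rule.

```python
def autofill_input_min_max(column, margin=0.05):
    """Return (minimum, maximum) padded slightly beyond the observed values.

    Assumption: the padding is `margin` times the observed spread.
    """
    low, high = min(column), max(column)
    pad = (high - low) * margin
    if pad == 0:
        pad = margin  # constant column: still widen the range a little
    return low - pad, high + pad


# Hypothetical "Number of Dependents" column from the training records.
dependents = [0, 2, 5, 1, 3]
print(autofill_input_min_max(dependents))
```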
Min/max autofill
Braincel will fill your min/max rows depending on which type of
output you have, bounded or unbounded. (The Fill In Min/Max box
offers two options: Bounded Outputs and Unbounded Outputs.)






If you choose to set your own min/max values, use common 
sense in your choices. For example, in the tutorial Loan Expert,
there was a "Number of Dependents" input. We could give this 
column a minimum value of 0 and a maximum value of 10. It's 
impossible to have fewer than zero dependents and very few 
people in the real world have more than 10 dependents. (No 
applicant in the training range had more than 5). Setting a 
maximum value of 100 would cause scaling problems for the 
Braincel Expert because it would consider having 9 dependents 
almost the same as having 1. This wouldn't show the network 
how different having 9 dependents is from having 1 dependent. 
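
A little arithmetic makes the scaling problem concrete. The Python sketch below assumes that a column is scaled linearly between its minimum and maximum (the manual does not give the exact formula) and compares how far apart 9 dependents and 1 dependent end up under each choice of maximum; the name scaled_gap is hypothetical.

```python
def scaled_gap(a, b, col_min, col_max):
    """Distance between two values after linear scaling to the 0-1 range."""
    return abs(a - b) / (col_max - col_min)


print(scaled_gap(9, 1, 0, 10))   # 0.8  -> clearly different to the network
print(scaled_gap(9, 1, 0, 100))  # 0.08 -> almost the same to the network
```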


Three necessary data sets: The Training Range, Test Range and 
Predict Test Range 


Braincel needs to see your data in three different sets as part of 
its learning process. 


The largest set is the Training Range. It should include 
approximately 90% of your data records. (You leave out 10% of
the records for testing later in the training process.) All input
and output columns, with their minimum and maximum rows 
are included. The Training Range needs to be large enough to 
show the Expert-in-training as many possible combinations of 
inputs as your machine memory and time permit. The network 
is learning by example, so the more examples you have, the 
better. 


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The second set is the Test Range. This range is used to see how 
accurate the Expert is on specific records in the training range. 
The Test Range forces the Expert to write its calculations for each 
record onto the spreadsheet so you can compare the calculations 
to the actual historical output. Now you can see how the 
network is performing on individual cases. This gives you a 
more accurate representation of the Expert's performance than
the average error percentage you see in the Braincel icon. Since
the error percentage in the icon is an average, it deemphasizes
the records that the Expert is not learning well.
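
The point about averages hiding poorly learned records can be shown with a short Python sketch. The per-record error formula (absolute difference divided by the output range) and the helper names are assumptions for illustration; Braincel's own calculation may differ.

```python
def record_errors(actual, calculated, out_min, out_max):
    """Percentage error for each record, scaled by the output range."""
    span = out_max - out_min
    return [100.0 * abs(a - c) / span for a, c in zip(actual, calculated)]


def poorly_learned(errors, threshold):
    """Indices of the records whose error exceeds the threshold."""
    return [index for index, error in enumerate(errors) if error > threshold]


# Hypothetical Test Range: four records are exact, one is off by a full point.
historical = [1, 2, 3, 4, 5]
calculated = [1.0, 2.0, 3.0, 4.0, 4.0]
errors = record_errors(historical, calculated, 1, 5)

print(sum(errors) / len(errors))    # 5.0 -> the average meets a 5% target
print(poorly_learned(errors, 5.0))  # [4] -> but the last record is 25% off
```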


The Test Range includes all input columns plus an empty 
column for each historical output. The minimum and maximum 
values are not included. You can test on all training records or 
on a smaller subset. 


The third set is the Predict Test Range. This is the 10% of the 
records that you withheld from training. Include the input 
columns for these records, plus an empty column for each 
output. The Braincel Expert will write its calculations into these 
empty columns. 


The Expert-in-training hasn't seen these records. Testing on this 
range will give you a good idea of how accurate the Expert will 
be when it's given new data. Since this is historical data, you 
know what the correct output should be. Compare the historic 
output with the Expert's calculated output. This test will give 
you a good idea of how well the Expert will predict on new data. 


When choosing records for the Predict Test Range, select records 
that represent the full spectrum of your data. For instance, in the 
tutorial loan Expert, we selected two records representing two 
different loan officer responses out of five possible. The tutorial 
was based on a tiny database. In your own application, use the 
Excel database functions to select a representative sample. This 
will show you whether or not the Expert is accurate throughout 
the database. 


These three data sets do not have to be single ranges. They can 
each be made up of several named ranges. See Variations: 
Using Multiple Ranges for information. 
Details on the Training/Testing process 
Braincel Expert learns the relationships in the data during the 
training and testing processes. Braincel self-monitors most of the 
learning process. The dynamic internals of the neural network 
(for example, learning rate) are controlled and the physical 
internal configuration is set by Braincel itself. You can override 
this self-monitoring. See Variations—Working in Professional 
User Mode. 


Just like teaching a human, Braincel requires a bit of supervision 
while learning. Primarily, the user determines how accurate the 
Expert should be on the training data. This accuracy is determined 
by the error specified during the Train Expert command. 
Also, the user specifies how long the Expert should train. 


Determining an acceptable error 
Different problems require different levels of accuracy. You want 
the Expert to be accurate, but not so accurate that it has memorized 
the training data, causing it to perform poorly on new 
cases. The error in the icon is an average of the errors for each 
record. The error for an individual record is calculated as 
follows: 


Error = Average(Hist. Output - Expert Calculated Output) / Standard Deviation of Hist. Output 


For example, the tutorial loan Expert had a min-max range of 1-5. 
If the network calculated a 4 and the historical output was a 3, 
the error for that record would be 25%. 
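
Both worked figures in this guide (25% for a calculated 4 against a historical 3 on a 1-5 scale, and 25% for a calculated 0.75 against a 1 on a 0-1 scale) match what you get by dividing the absolute difference by the output's min-max range. The sketch below assumes that scaling. It is an illustration only, not Braincel's internal calculation, and the function name record_error is hypothetical.

```python
def record_error(historical, calculated, out_min, out_max):
    """Absolute difference between the historical and calculated output,
    scaled by the output's min-max range (an assumed scaling)."""
    span = out_max - out_min
    if span <= 0:
        raise ValueError("out_max must be greater than out_min")
    return abs(historical - calculated) / span


# Loan tutorial: output range 1-5, historical output 3, calculated output 4.
loan_error = record_error(3, 4, 1, 5)
print(f"{loan_error:.0%}")
```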


It's best to interrupt training occasionally and test 
against both the training data and the withheld data; that 
is, the Test Range and the Predict Test Range. In this way, you'll 
see how accurate the Expert will be when asked to analyze new 
data. As long as the Expert continues to improve performance on 
the Predict Test Range, continue training to a lower error. As 
soon as performance worsens, stop training. 


You can automate this training and testing process in two ways: 
1) Use Automated Best Net Search (details pp. 41-47); 2) train on 
Unseen Data in Professional User Mode (details p. 70). 
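
The train-then-test routine described above (keep training while the Predict Test Range improves, stop as soon as it worsens) can be sketched as a simple loop. This is a generic illustration, not Braincel's own automation. The callables train_step and predict_test_error are hypothetical stand-ins for "train a little more" and "measure the error on the withheld records".

```python
def train_with_early_stopping(train_step, predict_test_error, max_rounds=100):
    """Train in rounds; stop as soon as error on the withheld records worsens.

    train_step         -- hypothetical callable that trains a little more
    predict_test_error -- hypothetical callable returning the current error
                          on the Predict Test Range (lower is better)
    Returns (rounds_completed, best_error_seen).
    """
    best = predict_test_error()
    rounds = 0
    for _ in range(max_rounds):
        train_step()
        rounds += 1
        current = predict_test_error()
        if current > best:  # performance worsened: stop training
            break
        best = current
    return rounds, best


# Toy stand-in for a network: the withheld-data error falls, then rises.
errors = iter([0.40, 0.30, 0.22, 0.18, 0.21, 0.25])
rounds, best = train_with_early_stopping(lambda: None, lambda: next(errors))
print(rounds, best)
```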


Determining an appropriate training time 

It is very difficult to determine in advance what an appropriate 
training time will be. In general, the more records you have, the 
longer the network will need to analyze the records. Also, the 
more inputs and outputs in each individual record, the more 
time the network will need. 


You may specify a training time of up to 99 hours 59 minutes 
and 59 seconds. If you need more time than this, you can set the 
Expert for additional cycles after the previous cycle is complete. 


Making sure the Braincel Expert is learning 

It is necessary to test the Expert during training. Only by testing 
will you determine how accurate the Expert truly is. The error 
percentage in the icon refers to the average difference scaled for 
the standard deviation of the output. 


This doesn't necessarily give you a true picture of how accurate 
the Expert is. For instance, we recommended that you test your 
loan Expert after it has achieved a 10% error. A 10% 
error may sound very high to you, but in fact it may be desirable, 
depending on your application. The error doesn't indicate 
how many times the network gives a correct answer. Consider 
the following example. 


Example: 
You have constructed a rainfall-predicting Expert in which 
an output of 1 means rain; 0 means no rain. The Expert 
calculates 0.75 for a particular day. 0.75 is closer to 1 than it 
is to 0, so the Expert predicts rain. If the weather is rainy that 
day, the Expert has given the correct answer, BUT its error is 
listed at 25%. 


It's not necessarily desirable to train to an extremely low error, 
like 1%; if the error is too low, the network will have trained 
itself too well. It will have memorized every record individually 
and not be able to generalize its knowledge to records it has 
never seen before. You must train and test, adjusting the error to 
get results on the Predict Test Range that are adequate for 
your needs. 


Generally, the Expert will not perform as well on the Predict Test 
Range as it did on the Test Range; it has never seen the data in 
the Predict Test Range before. You can expect the error percentage 
on the Predict Test Range with a well-trained Expert to be up 
to 10 points higher than it was on the Test Range. 


When you're satisfied with the Expert’s accuracy on the Predict 
Test Range, training is completed. Your Expert is ready to 
examine new data. 


Note: When creating your own Braincel Experts, if you are unsatisfied 
with the network's accuracy on the Predict Test Range but were 
satisfied with the network's performance on the Test Range, refer to 
Troubleshooting—Poor Performance on Predict Test Range. 


Can I use other Windows applications while training? 

You can enable task switching in Options Setup. This will allow 
you to use ALT-TAB to interrupt Braincel and switch to the 
Program Manager and hence other applications. However, 
Braincel will halt training until Excel is again the current application. 


Notes on using the fully-trained Expert 





Your Expert is ready to be used on new data. You can analyze 
this data by defining a New Record Range and putting the new 
data into it. The inputs must be entered in the same order they 
were presented for training. If you don't follow the training 
order, the calculated result will be unpredictable. 


What does the New Record Range include? 

The New Record Range includes columns to hold the inputs and 
an empty column to hold each calculated output. There are no 
minimum and maximum value rows. We recommend that you 
paste a copy of the same headings that you used in your training 
and testing ranges above your New Record Range. This will 
make it easier for you to keep your inputs in the same order as 
you presented them in training. 


The link between the network file and the worksheet 

After you've analyzed the new data with the Ask Expert command, 
the Expert file can be hot-linked to that range. You can 
enable a hot-link by selecting Enable Ask Link in the Expert 
menu. With this hot-link, the Braincel Expert will automatically 
recalculate the linked range any time a value in one of the linked 
cells changes. 

How can I use the Expert on a different worksheet? 

You can overwrite the worksheet and ranges in the Ask Expert 
box and start filling the Selected Ranges box with ranges from a 
different worksheet. 


All worksheets open in Excel will appear in the Worksheet list 
box in Ask Expert. 


Modifying Expert training 

You can modify your Expert's knowledge by giving it more data 
and retraining the Expert. For instance, the human loan approval 
officer could keep recording his or her cases and giving that 
information to the Braincel loan Expert. So, if the human modified 
his or her decision-making criteria, the Braincel Expert 
would see this pattern and modify its decision making. 


Note: When you update your Expert's knowledge, you must add 
the new cases to the original training range then start training 
again. Your training range will increase in size and the training 
time may also increase. 
Introduction 


In this chapter, we will take you through the entire Evolver process step by 
step. We will start by opening a simple spreadsheet model and defining the 
problem to Evolver. Then Evolver will optimize the problem, searching for 
the best solution. We will also discuss many of Evolver's special features. For 
additional information about any topic, see Chapter 5: Reference. 


Opening the Tutorial 


If you do not have Evolver installed on your hard drive, please refer to the 
installation section of Chapter 1: Getting Started and install Evolver before 
you begin this tutorial. 


1. Open Microsoft Excel for Windows. 


Excel will automatically open a blank spreadsheet titled "Sheet." 


[Screenshot: Excel presents you with an empty worksheet.] 


2. Close the blank sheet. 


3. Under the “File” menu, select “Open.” 
4. Open the "tutorial.xls" file located in the "examples.ev2" subdirectory. This 
subdirectory is inside the "xlstart" subdirectory, which is located in your Excel 
directory. 


The “tutorial.xls” worksheet appears as below: 


[Screenshot: TUTORIAL.XLS. The "tutorial.xls" spreadsheet describes a resource 
allocation problem where each worker must perform one task.] 


This worksheet models a common problem involving resource allocation. In 
this problem, you have 10 workers to perform 10 tasks. Each worker's ability 
to perform each task is rated on a scale of 0 to 10 (0 = cannot do the task, 
10 = perfect at the task). The challenge here is to match each worker to a task so 
that the overall productivity of the workers is maximized. 


The model provides a 10 x 10 grid in which each worker has been rated for 
each task. The "Chosen Task" column (column N) to the right of the grid 
arbitrarily assigns each worker to one task. The next column over (column P) 
checks what task was assigned, and enters each worker's rating for that task. 
Finally, the total "score" of the entire solution in cell P15 is the sum of 
all of the individual ratings (see screen below). 
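
The scoring logic of the worksheet can be mirrored in a few lines of code. The sketch below uses a hypothetical 3 x 3 ratings grid rather than the tutorial's 10 x 10 values, and the name total_score is an assumption introduced for illustration.

```python
# Hypothetical 3 x 3 ratings grid (the tutorial's grid is 10 x 10).
# ratings[w][t] is worker w's rating (0-10) at task number t + 1.
ratings = [
    [7, 9, 4],
    [3, 8, 6],
    [5, 2, 10],
]


def total_score(ratings, chosen_tasks):
    """Sum each worker's rating for the task assigned to that worker.

    chosen_tasks holds 1-based task numbers, one per worker, like the
    worksheet's "Chosen Task" column.
    """
    return sum(row[task - 1] for row, task in zip(ratings, chosen_tasks))


print(total_score(ratings, [2, 3, 1]))
```

Changing one entry of chosen_tasks and re-running plays the same role as editing a "Chosen Task" cell and letting the worksheet recalculate.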
In the model, Robert is assigned task 2, at which he rates a "9". 


5. Click on cell N5 (currently a 2), and change the value to 6. 


When you enter the new value, the model sees that Robert rates a “4” at task 
#6 and recalculates the Total Score to “61.” 


6. IMPORTANT: Change the value of cell N5 back to 2. 
Although this problem may seem simple, there are over 3.6 million possible 
ways to assign these workers to their tasks. If we added just one more 
constraint to the problem (e.g. if certain tasks required 2 workers, or some 
tasks required other tasks to be completed first), the problem's complexity 
would increase exponentially. 
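
The 3.6 million figure is 10 factorial: the number of distinct ways to give 10 workers 10 different tasks. A quick check, with count_assignments as a hypothetical helper name:

```python
import math


def count_assignments(workers):
    """Number of ways to assign each worker a different task."""
    return math.factorial(workers)


print(count_assignments(10))
print(count_assignments(11))  # one more worker and task multiplies the count by 11
```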


7. Pull down the “Formula” menu and select “Evolver.” 









[Screenshot: Excel's "Formula" menu, with "Evolver..." listed last, below "Solver...".] 
















NOTE: If Evolver is not available, check to make sure that you have installed 
it correctly. The file "evstub.xla" adds the "Evolver" item to the bottom of 
Excel's "Formula" menu. If this file is not in your "xlstart" directory, you must 
manually open this file from within Excel. For more installation information, 
see Appendix B at the back of this manual. 







When selected, the "Evolver" command takes about 10 seconds to load the 
"Evolver.xla" file and all of the Evolver solving methods. Evolver will only 
load once, and will remain available as long as Excel is open. 


Selecting Evolver opens the following Evolver main dialog: 
The Evolver main dialog is designed so users can describe their problem in a 
simple, straightforward way. In our tutorial example, we are trying to find the 
combination of workers to tasks that produces the maximum overall "score." 


8. Set the “Find the...” Setting to “maximum value.” 











9. Check that cell $P$15 is entered as the cell to optimize. 
All information in Evolver can be entered in the dialogs in two ways: you may 
type the reference into the field with absolute coordinates ($B$4, not B4), or 
with your cursor in the selected field, you may click on the cell(s) directly 
with the mouse. To select the spreadsheet cells underneath the dialog, drag the 
dialog to one side.* 


* To drag and move any window (or dialog), click on the title bar across the top of the window and hold 
down the mouse button while you drag it to a new location. 
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The Variables 


To complete the description of the problem, you must specify the location of 
the variables that Evolver can adjust as it searches for a better solution. 
Evolver can handle an unlimited number of variables. All variables are added 
and edited one block at a time, through the variables dialog. 





10. Click the “Add” button. 


When you "Add" or "Edit" variables, Evolver will present the following 
variables dialog box: 
[Screenshot of the variables dialog, with callouts for the variables field, the 
solving method field, and the variable options.] 





The variables dialog prompts the user to select the 
variables and choose how they should be treated. 


You will enter a block of variable cells in the variables field, and then specify 
a solving method to be used on those variables. Different types of variables 
are handled by different solving methods. The “recipe” solving method, for 
example, treats each variable as an ingredient in a recipe; each variable's value 
can be changed independently of the others'. In contrast, the "order" solving 
method swaps values between the adjustable cells, trying out different 
permutations of the original values*. In our tutorial example, we have a block 
of variables in column N that we want Evolver to adjust. 
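
The difference between the two solving methods can be illustrated with two small functions. This is only a sketch of the idea described above, not Evolver's implementation. The names recipe_change and order_change are hypothetical.

```python
import random


def recipe_change(values, low, high, rng):
    """'Recipe'-style change (sketch): give one cell a new value in
    [low, high], independently of the other cells."""
    new = list(values)
    new[rng.randrange(len(new))] = rng.randint(low, high)
    return new


def order_change(values, rng):
    """'Order'-style change (sketch): swap two cells, so the result is
    always a permutation of the original values."""
    new = list(values)
    i, j = rng.sample(range(len(new)), 2)
    new[i], new[j] = new[j], new[i]
    return new


rng = random.Random(0)
tasks = list(range(1, 11))  # task numbers 1-10, one per worker
print(recipe_change(tasks, 1, 10, rng))  # may assign the same task twice
print(order_change(tasks, rng))          # every task still appears exactly once
```

For the tutorial, where each task must be assigned exactly once, only the second kind of change keeps every candidate solution valid.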





NOTE: When you are modeling your problem in Excel, remember to group 
together all of the variables you want to be adjusted in a certain way. This 
way, each group or block of variables can then be defined by the upper left cell 
and the lower right cell. A separate sub-problem should be added for each 
distinct block of variables. 







11. Drag the variables dialog to the left. 


* For more information, see the "Solving Methods" section in Chapter 5: Reference. 
12. With the cursor in the variables field, drag across cells N4 through N13 (see 
dialog below). 
Drag the dialog to the side, then with the cursor placed in the 
variables field, drag across the block of variables to enter the cells. 


As you drag across cells in your worksheet, your variables field will show the 
block of cells being entered. Your variables field should now read 
$N$4:$N$13 as in the screen above. 


The Solving Methods 


In Evolver, different types of variables are handled by different solving methods. 
Now that we have chosen our variables for this problem, the second part of 
the variables dialog involves choosing which solving method should be used. 


Let us say that you do not understand which solving method to use. If you 
have any questions regarding Evolver, you may want to use the Evolver 
on-line help file. 


Using Help 

Both the Evolver main dialog and the Evolver variables dialog offer Help 
buttons. You can also access Evolver Help at any time (while Evolver dialogs 
are visible) by pressing the F1 key on your keyboard. 


13. Push the “Help” button, or press the F1 key on your keyboard. 
This will call up the following Evolver Help window: 
[Screenshot: the "Evolver 2.0 for Excel Help" window, listing topics including 
Installation, Main Dialog, Solving Methods, Reference, About Optimization, 
Solver vs. Evolver, Programming With Evolver, and Genetic Algorithms.] 





The Evolver Help window is just a mouse click away. 


The Help window contains information indexed by topic. You can select any 
topic by clicking on it, or click on the "Search" button to look up a 
specific term. By navigating through the Help window, you can learn more 
about solving methods, and find examples of where to use which method. 





NOTE: The Evolver Help system is designed like most standard Windows help 
systems. If you are new to using this type of Help system, refer to your Excel 
User's Guide, or select the "Using Help" topic. 







14. To learn more about solving methods, click on the "Solving Methods" term. 
15. Read about the "Recipe" and “Order” solving methods. 


You learn that these two solving methods are the most popular, and that they 
can be used together to solve complex combinatorial problems. Specifically, 
you learn that the "recipe" solving method treats each variable as an 
ingredient in a recipe, trying to find the "best mix" by changing each 
variable's value independently. In contrast, the "order" solving method swaps 
values between variables, shuffling the original values to find the "best 
order." 


In this problem we are looking for the best way to shuffle the existing 
variables, so the "order" solving method should be used. If you are still 
unsure about which solving method to use on your variables, refer to the 
Solving Methods section of Chapter 5: Reference for a complete description of 
each method and the specific options that accompany them. 


16. Close Evolver Help and return to the variables dialog. 


17. Click on the button to the right of the solving method field, then select the 
“order” solving method. 





18. Push the "OK" button. 


This returns you to the Evolver main dialog. 
You will notice that the variables you selected, along with the solving method 
to use on those variables, are now listed in the variables field of the Evolver 
main dialog. Each set of variables, and the settings for how they should be 
treated, is a sub-problem (see Chapter 5: Reference). 
If there were additional variables in this problem, we would continue to add 
sub-problems for each set of variables. In Evolver, you may create an 
unlimited number of sub-problems. 
19. Click the "Add" button again. 


A new Evolver variables dialog will appear which allows you to choose new 
variable cells. In this problem, however, we only have one sub-problem. 


20. Click “Cancel” to return to the Evolver main dialog. 


Later, you may want to check the variables or change some of their settings. 
To do this, simply click on the sub-problem you would like to inspect (the 
sub-problem will appear highlighted in inverse), and click the "Edit" button. 
You may also delete a selected sub-problem by pushing the 
"Delete" button. 


The Stopping Conditions 


Evolver can run as long as you wish. The stopping conditions allow Evolver 
to automatically stop when either: a) a certain number of scenarios or "trials" 
have been examined, b) a certain amount of time has elapsed, or c) no 
improvement has been found in the last n scenarios. 


You can select any combination of these three stopping conditions, or none at 
all, through the main Evolver dialog. You can also stop Evolver manually by 
pressing the "Esc" key while Evolver is running. 
Number of trials: This option sets the number of trials that you would like 
Evolver to run. In each trial, Evolver evaluates one possible solution or scenario. 

Amount of time: Evolver will run for the specified amount of time until it stops. 

No improvement: This stopping condition is the most popular because it keeps 
track of the improvement and allows Evolver to run until it is no longer 
finding better solutions. 
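
The way the three stopping conditions combine (any enabled condition ends the run; a disabled one is ignored) can be sketched as a single check. This is an illustration under assumed names, not Evolver's code. The function should_stop and all of its parameters are hypothetical.

```python
import time


def should_stop(trials, started_at, trials_since_improvement,
                max_trials=None, max_seconds=None, max_stale_trials=None,
                now=None):
    """Return True when any enabled stopping condition is met.

    A limit of None means that condition is switched off. `now` can be
    passed explicitly (in seconds) to make the check easy to test.
    """
    now = time.monotonic() if now is None else now
    if max_trials is not None and trials >= max_trials:
        return True
    if max_seconds is not None and now - started_at >= max_seconds:
        return True
    if (max_stale_trials is not None
            and trials_since_improvement >= max_stale_trials):
        return True
    return False


# Only the time condition is enabled: stop after 3 minutes (180 seconds).
print(should_stop(trials=5000, started_at=0.0, trials_since_improvement=10,
                  max_seconds=180, now=181.0))
```

With every limit left at None the function never returns True, which corresponds to running until stopped manually.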


21. Turn on the minutes option only, and change the number of minutes to 3. 
The Screen Options 


When Evolver runs, you can set two viewing options. These settings 
determine what you see while Evolver is running. 


redraw screen: This option redraws the screen after 
each calculation, allowing you to see 
Evolver adjusting the variables and 
calculating the output. We suggest 
this option be turned on while you are 
learning Evolver, but later turned off 
to increase Evolver's speed. 


graph best results: When this option is selected, Evolver 
will build a graph and update it after 
every 20 scenarios, plotting the 
solution so that users can see how 
their problem is progressing as 
Evolver is optimizing. 
















22. Select the "redraw screen" option, and de-select the "graph best results" option. 
The following screen illustrates what your Evolver main dialog should look 
like at this point: 
This is the Evolver main dialog completely filled out. 


Running Evolver 
Once the problem is defined and the stopping conditions are set, you can run 
Evolver by clicking the OK button. Evolver will redraw the screen, showing 
you how it is trying different scenarios. It is using the genetic algorithm to 
weed out the poor solutions and let the fittest solutions survive and 
reproduce. For more information about how Evolver is searching for 
solutions, see Chapter 7: About Optimization. 
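
To give a feel for "weed out the poor solutions and let the fittest survive and reproduce," here is a very small genetic-style search over worker-to-task assignments. It is a simplified sketch (selection plus swap mutation, no crossover) under stated assumptions, not Evolver's actual algorithm. The function evolve_assignment and its parameters are hypothetical.

```python
import random


def evolve_assignment(ratings, generations=200, pop_size=20, seed=0):
    """Tiny genetic-style search over permutations (illustrative only).

    ratings[w][t] is worker w's rating at task t (0-based).
    Returns (best_permutation, best_score).
    """
    rng = random.Random(seed)
    n = len(ratings)

    def score(perm):
        return sum(ratings[w][t] for w, t in enumerate(perm))

    def swap_two(perm):
        child = list(perm)
        i, j = rng.sample(range(n), 2)
        child[i], child[j] = child[j], child[i]
        return child

    # Start from random assignments in which each task is used exactly once.
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        survivors = population[: pop_size // 2]        # the fittest survive
        children = [swap_two(rng.choice(survivors))    # and reproduce
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = max(population, key=score)
    return best, score(best)


# Hypothetical 3 x 3 ratings grid, not the tutorial's data.
grid = [[7, 9, 4], [3, 8, 6], [5, 2, 10]]
best, best_score = evolve_assignment(grid, generations=50)
print(best, best_score)
```

Because the search is randomized, different seeds can return different assignments.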


Regardless of whether you have asked Evolver to redraw the screen, you can 
always tell if Evolver is running by the message in the status bar located at the 
bottom of your screen. The status bar also shows the best solution Evolver 
has found so far (see below). 


[Status bar: Best=65. Evolver in progress: press Esc to interrupt...] 


Evolver will stop after the 3 minutes have passed, and will present you with 
the following stop alert window: 


The time limit has been reached. 
Best solution so far = 83 


You can either Accept the best solution found, Restore the 
original values of the variables, or Continue evolving. 


The Stop alert tells you when your stopping conditions have been met. 


The three options are: 
















Continue: This option allows you 
to continue searching for 
solutions. When your 
original stopping 
condition(s) are met again, 
Evolver will stop again. 


Restore: This option restores all 
of your variables to their 
original values, and 
returns to the 
spreadsheet as it was 
before you ran Evolver. 


Accept: Accept will 
place all of the variable 
values from the best 
solution so far into your 
spreadsheet. 


23. Click on the "Accept" button. 


You will be returned to the tutorial spreadsheet, with all of the new values 
that created the solution. Although in this example Evolver found a solution 
which yielded a total score of 83, your result may be higher or lower than this. 
Evolver may also find a different combination of workers to tasks that 
produces the same total score. These differences are due to the nature of 
Evolver's genetic algorithm engine, and they are the reason why Evolver is 
able to solve a wider variety of problems, and find better solutions (see 
Chapter 7: About Optimization for more information). 


If you would like to save the solution that Evolver produced, but would also 
like to retain a copy of the original spreadsheet, be sure to do a "save as" after 
you accept, to save the sheet under a new name. Then when you close or quit 
the original, do not save changes. 


Running multiple programs 


Evolver's genetic algorithm technique is very compute-intensive, and requires 
the Excel program to lock in a loop. However, you can break out of that loop 
and have Evolver work while you run other applications by following the 
steps below: 


1. At any time while Evolver is running, press and hold down the "Control" key 
while you hit the "Esc" key. 
This key combination will call up the Windows "Task List" window, or 
whatever task manager you have added to your system (e.g. METZ). 


2. Select the Program Manager. 


From Program Manager, you are free to launch any application without 
disturbing Evolver. This technique can be used to run the Evolver Watcher 
standalone program (included with Evolver) to see what Evolver is doing. 


Resetting Evolver 


If you save the "tutorial.xls" spreadsheet, all of the Evolver settings will be 
saved along with the sheet. If you quit Excel and do not save your changes, 
you will get the original Evolver settings the next time you open this 
worksheet. 





NOTE: Any time you call Evolver while in Excel, all of the Evolver settings will 
be saved along with the spreadsheet that was open. The next time that sheet is 
opened, all of the most recent Evolver settings will still be stored in Evolver. 






Applications 


Radio Tower Location 


A radio network wants to build three radio towers in a region that has eight 
major communities. Each community has a different population size, and each 
radio tower has a different strength (broadcast range). The goal is to place the 
towers so that the maximum number of potential listeners fall inside the radii 
of the towers. 


A more complicated example of a location problem might be to locate several 
factories so that they are a) in the vicinity of both vendors and customers, b) in 
affordable, open land, and c) near a large, technically trained work force. Any 
number of additional influences on the best locations, such as tax incentives, 
can also be added to such a model. Evolver can then find the best locations in 
x,y coordinate space. 


If only one radio tower or one factory needed to be located, the problem 
would be relatively easy to solve; the factory could be moved about until all 
constraints were adequately met. If several or many objects need to be located 
and the location of one affects the location of another, then the problem is 
complex enough to warrant using Evolver. 







Find the best x,y coordinates for three radio towers so that the 
maximum potential listening population falls inside their 
broadcasting range. 

Similar problems: Find sites for warehouses that minimize the shipping necessary 
between warehouses and stores. Locate fire stations so that 
populations are best covered with a limited number of stations, 
including factors such as housing density. 


Example file: location.xls 
Solving method: recipe 
How The Model Works 


The file “location.xls” models a two-dimensional landscape where the 
placement of three radio towers determines how many listeners are reached. 
Cells C6:D8 contain the x,y coordinates for the three towers. The graph 
recalculates automatically to show the locations of the towers in a graph, and 
the density of the populations in the neighborhoods. 





Nine communities are represented as single-point locations. Each community 
is considered as either entirely covered (yes) or not covered (no) depending on 
whether its point location falls inside the radius of any one of the three towers 
(located in cell C11, D11, and E11). The total population of all the covered 
communities is calculated in cell O14. 
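
The coverage rule described above (a community counts in full if its point lies inside any tower's radius) can be written out directly. The sketch below uses hypothetical coordinates, radii, and populations, not the values in location.xls. The function name covered_population is an assumption.

```python
import math


def covered_population(towers, communities):
    """Total population of communities inside at least one tower's radius.

    towers      -- list of (x, y, radius)
    communities -- list of (x, y, population)
    """
    total = 0
    for cx, cy, population in communities:
        if any(math.hypot(cx - tx, cy - ty) <= radius
               for tx, ty, radius in towers):
            total += population
    return total


# Hypothetical data, not the values in location.xls.
towers = [(10, 10, 5), (30, 30, 8)]
communities = [(12, 11, 1000), (40, 40, 500), (30, 36, 2500)]
print(covered_population(towers, communities))
```

An optimizer would adjust only the tower x,y values and keep the radii and community data fixed, re-evaluating this total for each candidate placement.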


How To Solve It 





Maximize the population reached in cell O14 by adjusting the tower location 
cells C6:D8. Use the "recipe" solving method and set the ranges for the 
variables from 0 to 50 (see the completed dialogs below). 
Recipe Variables 
The "recipe" solving method tells Evolver to adjust the variables chosen in 
any way it sees fit. As is the case with a recipe for baking, we are trying to find 
the right mix of "ingredients" (x,y coordinates) to produce the optimum 
solution. 
Homework Exercise 
OPIM 101 Introduction to the Computer as an Analysis Tool 


Homework Rules: 
Always attempt the homework prior to the class it is due! 
Homework is for class discussion only. It is not graded. 
Never spend more than 45 minutes on the homework exercise! 
If you can’t understand it in 45 minutes, then bring your questions to class. 


Homework 1: Lotek Industries (Getting Started) 

If you have not used Excel 5.0 before, run appropriate examples and demos located under the 
help menu in Excel. Topics relevant to this exercise include: working in workbooks, selecting 
cells, using toolbars, entering data, creating formulas and links, editing a worksheet and 
formatting a worksheet. If you have used Excel 4.0 before, run the Quick Preview under the help 
menu to help acquaint you with the differences between version 4 and version 5. 


You are the president of Lotek Industries, a consumer durables manufacturer. Your board chair 
and CEO, Megan Oldmoney, has asked you to look into an opportunity to acquire a plant that 
manufactures Koolbreez brand home air conditioners. The current owner, Rustbelt, Inc., is 
willing to sell the Koolbreez plant for $2,000,000, and has faxed you a copy of their 5-year 
corporate planning model for the Koolbreez unit. The model, written in Excel version 5.0, is 
shown on the back of this sheet. 


Type the model into Excel. Format the model identically as the one on the back of this sheet. 
There are three sections. The section labeled “Model Assumptions” contains constant parameters 
that are referred to elsewhere in the model. 


The section labeled “Income Assumptions” contains formulas. Market growth increases 0.2% 
annually. Thus, cell C12 should contain the formula =B12+.002. Production cost per unit is 400 
in 1995 then increases 1% per year (e.g., 1996 Production cost per unit = B13 x (1 + $B$7) ). 


The Proforma Income Statement computes revenue, gross margin and income. Compute 1995 
sales as the product of total market times initial market share. 1996 sales are the product of 1995 
sales times (1 + market growth for 1995). Compute sales for subsequent years the same way. 
Price per unit is $520 in 1995. Revenues equal price per unit times sales. Total fixed costs are 
$2,850,000. Both price per unit and fixed costs increase by (1+cost growth) in subsequent years 
(e.g., 1996 Price per unit = B17 x (1+$B$6) ). Cost of goods sold equals production cost per 
unit times sales. Gross margin equals revenues minus cost of goods sold. And finally, income 
equals gross margin minus total fixed costs. All totals use the SUM( ) formula except cell G17. 
This value is the average price over the 5-year period. Use the АУЕКАСЕ( ) formula to compute 
this value. 
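
The row-by-row logic described above can be sketched outside Excel to check the formulas. This is only an illustration: the total market (500,000) and initial market share (6%) are the figures quoted in the later sensitivity-analysis homework, and the 1995 market growth rate and the cost growth rate are hypothetical placeholders, since the printed model is not reproduced here.

```python
YEARS = [1995, 1996, 1997, 1998, 1999]

def build_model(total_market=500_000,      # figure quoted in the later homework
                initial_share=0.06,        # figure quoted in the later homework
                market_growth_1995=0.02,   # ASSUMED placeholder for cell B12
                cost_growth=0.03,          # ASSUMED placeholder for cell B6
                prod_cost_growth=0.01,     # 1% per year (cell B7)
                price=520.0,               # 1995 price per unit
                prod_cost=400.0,           # 1995 production cost per unit
                fixed_costs=2_850_000.0):  # 1995 total fixed costs
    """Return one dict per year, following the worksheet's formulas."""
    rows = []
    growth = market_growth_1995
    sales = total_market * initial_share
    for i, year in enumerate(YEARS):
        if i > 0:
            sales *= 1 + growth            # prior sales x (1 + prior-year growth)
            growth += 0.002                # market growth rises 0.2 points a year
            prod_cost *= 1 + prod_cost_growth
            price *= 1 + cost_growth
            fixed_costs *= 1 + cost_growth
        revenue = price * sales
        cogs = prod_cost * sales
        gross_margin = revenue - cogs
        rows.append({
            "year": year, "market_growth": growth, "sales": sales,
            "price": price, "prod_cost": prod_cost, "revenue": revenue,
            "cogs": cogs, "gross_margin": gross_margin,
            "fixed_costs": fixed_costs, "income": gross_margin - fixed_costs,
        })
    return rows

rows = build_model()
total_income = sum(r["income"] for r in rows)             # SUM( ) column
average_price = sum(r["price"] for r in rows) / len(rows) # AVERAGE( ) in G17
for r in rows:
    print(r["year"], round(r["sales"]), round(r["income"]))
```

Changing `initial_share` or `total_market` in the call to `build_model` plays the same role as editing the corresponding cell in the Model Assumptions section.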


1. What if initial market share is 5%? Is income greater than zero for each of the five years? 


2. What is Koolbreez’s total 5-year income if total market size was only 490,000? 
Homework Exercise 
OPIM 101 Introduction to the Computer as an Analysis Tool 


Homework Rules: 
Always attempt the homework prior to the class it is due! 
Homework is for class discussion only. It is not graded. 
Never spend more than 45 minutes on the homework exercise! 
If you can’t understand it in 45 minutes, then bring your questions to class. 


Homework 2: Lotek Industries (Model Revisions) 


Lotek’s strategic planning methodology is to analyze investment opportunities using net present 
value of income. Projects with a positive NPV are worth pursuing; projects with a negative NPV 
are not worth pursuing. Using the NPV function and a discount rate of 15%, find the NPV of the 
project for each year in the five-year period. Insert two lines into the model assumptions section: 
one for the discount rate of 15%, the other for the investment of $2,000,000. Then, create 
named cells for each parameter in the model assumptions section. Once this is complete, the 
formula for B31 should be =NPV(Discount_rate,B29)-Investment; the formula for C31 should be 
=NPV(Discount_rate,B29:C29)-Investment, etc. Cell F31 should contain the NPV which 
encompasses the full five-year planning horizon. 


1. Is Koolbreez a good deal from an NPV perspective? (Is cell F31 positive?) 


To aid the usability of the model and interpretation of the results, use an IF statement in cell G31 
to display whether Lotek should invest or not invest. The IF statement has the form: 
=IF(logical test, value if true, value if false). The logical test is a statement that evaluates to 
TRUE or FALSE (e.g., compare whether F31 >= 0). The “value if true” could be a numeric 
value, text or a formula. The formula for cell G31 should be: 


=IF(F31>0,"Invest","Do Not Invest") 
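
As a cross-check on the spreadsheet, the NPV row and the IF test can be sketched in a few lines of Python. The function below mimics Excel's NPV convention, in which the first value is discounted one full period; the income figures are hypothetical placeholders, not the model's results.

```python
def excel_npv(rate, values):
    """Mimic Excel's NPV: value t (0-based) is divided by (1 + rate)**(t + 1)."""
    return sum(v / (1 + rate) ** (t + 1) for t, v in enumerate(values))

def cumulative_npvs(rate, incomes, investment):
    """Row 31: NPV of the first 1, 2, ... n years of income, less the investment."""
    return [excel_npv(rate, incomes[:n]) - investment
            for n in range(1, len(incomes) + 1)]

def decision(npv):
    """Equivalent of =IF(F31>0,"Invest","Do Not Invest")."""
    return "Invest" if npv > 0 else "Do Not Invest"

# Hypothetical income row (B29:F29), for illustration only.
incomes = [750_000, 800_000, 850_000, 900_000, 950_000]
npvs = cumulative_npvs(0.15, incomes, 2_000_000)
print([round(v) for v in npvs], decision(npvs[-1]))
```

Note that an NPV of exactly zero yields "Do Not Invest" here, because the handout's formula tests F31>0 rather than F31>=0.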


You recently had lunch with Buzz Bowtie, Lotek’s VP of Marketing, who told you about an error 
in your planning model. The market share data are based on an advertising campaign but the 
model does not contain advertising costs. Buzz estimates that Lotek’s ad agency, Bucks and 
Morebucks, would charge an advertising fee of $800,000 in 1995 and $400,000 in each of the 
subsequent four years. Add a new line called “Advertising expense” to the income assumption 
section of the model. 


2. Is the Koolbreez investment still a good deal from an NPV perspective? (Is cell F31 positive?) 


To facilitate sensitivity analyses, add a new line called “Market share growth rate” to the model. 
Enter 0.2% as a constant in cell B10. Change the market growth line to reflect the parameter 
value for market share growth rate rather than a fixed number in each cell. These changes will 
help prevent errors in market share growth analysis. 


Homework 3: Lotek Industries (Graphs) 


Megan Oldmoney, CEO and Chair of the board, has asked for a graph of revenues and income for 
the five-year Koolbreez investment plan. She would also like a projection of income in the year 
2000. Prepare a graph like the one shown below. The added labels show color or added features. 


You begin the task by brushing up on Excel 5.0 graphing features. Examine the topics under the 
Examples and Demos item on the Help menu in Excel. Three menu topics contain help for graphing data. 
“Creating a chart” contains a basic introduction to graphing. “Formatting a chart” explains how 
to change scale information. “Using charts to analyze data” shows how to add a trend line to 
your graph. You can also use the Workshop III section of the Excel 5 Superbook as a reference 
guide. Page 41 contains help on selecting noncontiguous ranges. Once the three rows containing 
dates, revenues, and income are selected, use the Insert Chart submenu to insert a new sheet in 
your workbook and follow the Chart Wizard prompts to create your graph. 


Homework 4: Lotek Industries (Slide Show) 


Megan Oldmoney, CEO and Chair of the board, was impressed with the graph and has asked you 
to make a slide presentation to the board of directors. She would like you to highlight the current 
situation with the Koolbreez investment decision and state your recommendation. 


You begin the task by brushing up on PowerPoint 4.0 slide show features. Quick Preview on the 
PowerPoint help menu is helpful for getting started. Create a new slide show using a template 
wizard created by a PowerPoint macro. Depending on how your machine is configured, 
PowerPoint will open a dialog box for creating a new presentation (if it does not, select File New 
from the menu bar). Use the Auto Content Wizard and fill in the boxes as prompted. Press Next to 
continue from one screen to the next within the Auto Content Wizard. When the Auto Content 
Wizard is finished, Cue Cards provide tips for working with PowerPoint. Switch to slide view to 
edit the content supplied by the Auto Content Wizard. Replace graphs and text with material for 
the Homework #4 presentation. Your preliminary notes for the presentation contain the following 
seven slides. 


o Title slide: Koolbreez Investment Plan (note: add some clip art) 
o Statement of today's situation - should we invest? 
o Proforma Income Statement 
o State model assumptions 
o Graph of revenues and income for five-year plan with trend line for the year 2000 
o Recommendation 
o Identify action items - what to do next 


To create a slide, copy the desired range of cells or chart from Excel into the Windows clipboard. 
Select Paste to copy the contents of the Clipboard into your slide show. Use the PowerPoint 
menu bars to select video transition effects. Each slide may have a different special effect. 


The real issue is one of designing visually appealing slides that communicate your message. Use 
borders and pattern fills to add color to your slides. Change the font colors and sizes to capture 
audience attention as well as add legibility for greater viewing distances. Add clip art from the 
Insert Object menu into your slides. The Microsoft Clipart gallery contains hundreds of pieces of 
professional-quality commercial artwork that will enhance the appearance of your slide show. 


Homework 5: Lotek Industries (Sensitivity Analysis) 

Buzz Bowtie, Lotek’s VP of Marketing, discussed the Koolbreez advertising plan with Lotek’s ad 
agency, Bucks and Morebucks. Buzz estimates that a stronger ad campaign could increase 
market share for Koolbreez from its current level of 6% to 7% beginning in 1995. This would 
require advertising fees of $1,600,000 in 1995 and $800,000 in each of the subsequent four years. 
Analyze and report the net present value of this alternative advertising campaign. 


The total market of 500,000 units is Rustbelt’s estimate. Since Rustbelt Inc. is selling the 
Koolbreez plant, you are leery of their estimate. You consult with George Guesswell, Forecast 
Analyst with Lotek’s Corporate Planning Staff. George indicates that although the 500,000 
estimate is pretty accurate, depending on economic conditions, it conceivably could vary from 
490,000 to 515,000. Analyze and report the NPV of two alternative scenarios: a best case total 
market of 515,000 and a worst case total market of 490,000. (Assume that this is the only change 
to the base model). 


Megan Oldmoney stops by your office on her way home at 6:30 in the evening: “I’d like to 
present our Koolbreez proposal to the Board at 9AM tomorrow, but first I wanted to find out 
what you think about the deal. Also, what is our IRR for the project?” The IRR, internal rate of 
return, is the discount rate when the NPV is zero. You know this will be easy to compute using 
the goal seek command in Excel, but how should you respond to Oldmoney’s other concerns? 
Should Lotek invest in the Koolbreez deal or not? If so, do you recommend the ad campaign? 
Which total market value should you use? best case? worst case? both? neither? 
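
The idea behind using goal seek for the IRR (search for the discount rate that drives the NPV cell to zero) can be sketched as a simple bisection. Bisection is used here only as a stand-in for whatever search Excel performs internally, and the cash flows in the example are hypothetical.

```python
def excel_npv(rate, values):
    """Mimic Excel's NPV: the first value is discounted one full period."""
    return sum(v / (1 + rate) ** (t + 1) for t, v in enumerate(values))

def irr(incomes, investment, lo=0.0, hi=1.0, tol=1e-9):
    """Find the rate where NPV(rate, incomes) - investment = 0 by bisection.

    Assumes the target changes sign exactly once between lo and hi.
    """
    def target(rate):
        return excel_npv(rate, incomes) - investment

    if target(lo) * target(hi) > 0:
        raise ValueError("NPV does not change sign between lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if target(lo) * target(mid) <= 0:
            hi = mid          # root lies in the lower half
        else:
            lo = mid          # root lies in the upper half
    return (lo + hi) / 2

# Hypothetical example: invest 100, receive 60 in each of two years.
rate = irr([60, 60], 100)
print(f"IRR = {rate:.4%}")
```

If the IRR found this way exceeds the 15% discount rate, the NPV at 15% is positive, which is the same conclusion the NPV test gives.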


While presenting your recommendations on Koolbreez to the board of directors, board member J. 
Parker Pinstripe III interrupts you in mid-sentence: “It seems to me your analysis assumes that 
unit production costs will only increase at 1% per year. I happen to play golf with the CEO of a 
key supplier for Koolbreez who tells me his company is currently negotiating its labor contract, 
which will probably require them to raise prices. You might want to fold this issue into your 
thinking before we commit on this acquisition." Walking out of the meeting, Oldmoney casually 
mentions to you: "Let me know if you get a chance to look at the impact of that supplier on 
possible unit production cost increases. I’m having lunch over at the Country Club with the boys 
from Rustbelt today, and they're anxious to finalize this Koolbreez deal. If I don't hear from you 
before lunch, I'll assume production costs are a non-issue." 
“What if” overload has given you a mega-headache. Your stomach is churning and the TUMS 
bottle is empty. How can you manage all these model assumptions and make sense of all these 
analyses? You call George Guesswell but he is not in. His secretary refers you to Molly 
Modelwright, a DSS wizard from the Wharton School. “No problem,” says Modelwright, “just 
use the Scenario Manager in Excel. Each scenario can be labeled with a unique name like 
Advertising, Worst total market, Best total market, Production cost, base case, etc. Excel 
prompts you for cells that change in each scenario. Once entered you can quickly view each 
scenario with the Show button or display a summary report showing the effect of changed cells on 
a range of result cells like NPV. Chapter 39 in the Excel 5 Superbook has a good introduction to 
the scenario manager.” You think, “What a relief,” thank Molly, and begin to examine the Lotek 
investment with new vigor. 


Do you still plan to go ahead with the Koolbreez proposal? Why or why not? 


Homework 6: Lotek Industries (Data Tables) 


Your sensitivity analysis of the five-year Koolbreez investment plan is still not complete. You are 
now interested in examining the effects of discount rate and initial market share on net present 
value. The problem with scenario manager is that you can only see the result of one modification 
at a time. To study the effect a range of values has on a formula and view all numbers 
simultaneously, you need to use a data table. You plan to use a two-way data table (see pages 
446-447 in the Excel 5.0 Superbook). The data table should be on the same worksheet as the 
model. You will vary the discount rate from 8% to 20% and vary the initial market share from 
1% to 10%. Both ranges will be incremented by 1%. The formula is entered in the upper-left-hand 
corner of the data table. Format your data table like the one shown below. 


Complete the data table and determine the effect of discount rate and initial market 
share on net present value. Does the data table provide any new critical insights regarding the 
Koolbreez plant investment decision? 
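
What a two-way data table computes can be sketched as a nested loop: one formula evaluated for every combination of two inputs. The `toy_npv` function below is a deliberately simplified stand-in for the worksheet (it assumes the same income every year, a $120 unit margin, and no growth), so its numbers will not match the real model.

```python
def excel_npv(rate, values):
    """Mimic Excel's NPV: the first value is discounted one full period."""
    return sum(v / (1 + rate) ** (t + 1) for t, v in enumerate(values))

def toy_npv(rate, share, total_market=500_000, margin=120.0,
            fixed_costs=2_850_000.0, investment=2_000_000.0, years=5):
    """Simplified stand-in for cell F31: flat income for five years (ASSUMED)."""
    income = total_market * share * margin - fixed_costs
    return excel_npv(rate, [income] * years) - investment

rates = [r / 100 for r in range(8, 21)]    # 8% to 20% in 1% steps
shares = [s / 100 for s in range(1, 11)]   # 1% to 10% in 1% steps

# One row per discount rate, one column per initial market share.
table = [[toy_npv(rate, share) for share in shares] for rate in rates]

print("rate  " + "".join(f"{s:>12.0%}" for s in shares))
for rate, row in zip(rates, table):
    print(f"{rate:>4.0%}  " + "".join(f"{v:>12,.0f}" for v in row))
```

Reading across a row shows the effect of market share at one discount rate; reading down a column shows the effect of the discount rate at one market share.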


[Data table layout not reproduced; surviving axis label: Initial market share] 
Homework 7: Lotek Industries (Monte Carlo Simulation and Risk Analysis) 
Molly Modelwright, the DSS wizard from the Corporate Planning Staff, calls just as you finish the 
data table. She asks how the sensitivity analyses are going. You start griping that there are just 
too many possible combinations of input values to calculate every possible answer and ask her if 
there is a better way to figure out which model parameters are the most important. Molly said, 
“Monte Carlo simulation is an efficient way to handle these types of problems. Instead of 
calculating all possible combinations of input values, Monte Carlo simulation uses a random 
number generator to select a range of possible input values for a model. The result is expressed 
as a range of possible values using descriptive statistics. And besides, Monte Carlo simulation is 
easy to do using the Crystal Ball add-in under the Excel tool menu.” “This is great,” you tell her, 
“How do I get started?” She tells you about two tutorials (see Crystal Ball version 3.0 User 
Manual - handout #12 in the course pack). 


The first tutorial, “Futura Apartments” simulates profit/loss projections from apartment rentals. 
This tutorial is ready to run so you can quickly see how Crystal Ball works. If you work with 
statistics and forecasting techniques, this may be all the introduction you need before running your 
own models with Crystal Ball. The second tutorial, “Vision Research”, gives you a chance to 
enter data and set up a complete simulation for a major corporate expenditure decision. As you 
work through the second tutorial, do not worry about making mistakes; recovery is as easy as 
backing up and repeating the steps. 


After completing the tutorials you have learned a lot about building models for making decisions 
under uncertainty. Now it is time to incorporate these new skills in the Koolbreez plant 
investment decision. For the model parameter assumptions, use the uniform distributions in the 
table. Use NPV as the forecast cell. Run the model for 1000 iterations and create a report of the 
results. There are two big questions to address. First, how certain are you of attaining an NPV 
greater than zero? Second, what model parameters explain the most variance in NPV? 
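
The mechanics of a Monte Carlo run (draw each uncertain input from its distribution, recompute the forecast, repeat, then summarize) can be sketched without the add-in. Everything specific below is an assumption for illustration: the uniform ranges are hypothetical, and `toy_npv` is a simplified flat-income stand-in for the worksheet's NPV cell.

```python
import random
import statistics

def excel_npv(rate, values):
    """Mimic Excel's NPV: the first value is discounted one full period."""
    return sum(v / (1 + rate) ** (t + 1) for t, v in enumerate(values))

def toy_npv(total_market, discount_rate, investment,
            share=0.06, margin=120.0, fixed_costs=2_850_000.0, years=5):
    """Simplified stand-in for the forecast cell (ASSUMED flat income)."""
    income = total_market * share * margin - fixed_costs
    return excel_npv(discount_rate, [income] * years) - investment

def simulate(model, ranges, n_trials=1000, seed=0):
    """Draw every input from its uniform range and evaluate the model each trial."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    results = []
    for _ in range(n_trials):
        draw = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        results.append(model(**draw))
    return results

# Hypothetical uniform ranges (minimum, maximum) for three parameters.
RANGES = {
    "total_market": (450_000, 550_000),
    "discount_rate": (0.10, 0.20),
    "investment": (1_500_000, 2_000_000),
}

npvs = simulate(toy_npv, RANGES)
prob_positive = sum(v > 0 for v in npvs) / len(npvs)
print(f"mean NPV = {statistics.mean(npvs):,.0f}")
print(f"std dev  = {statistics.stdev(npvs):,.0f}")
print(f"P(NPV > 0) = {prob_positive:.1%}")
```

The share of trials with a positive NPV speaks to the first question; ranking which inputs explain the most variance needs a further step that this sketch does not attempt.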

[Table of uniform distribution ranges for the model parameters not recoverable; surviving values: 550,000; 20.0%; $1,500,000; $2,000,000] 
Homework 8: Big Mac Attack 


1 


Download the template homewk8.xls from the student network. It should be identical to 
the one on the following page. Save the file for homework 9-11. 


Study the rows and columns of the worksheet. 

Column A contains a list of items that can be ordered from McDonald’s. 

Column B contains the price of each item. (Note that these prices may not reflect the 
actual price at your favorite McDonald’s, but at the time, these were the actual prices.) 
Column C contains the total calories for each item. 

Column D through column M contain data about the nutrients and vitamins contained in 
each food item. 

Column N contains a quantity for each food item. (Note: all cells except N5:N19 are 
locked and password protected. Thus, these are the only values on the spreadsheet that 
may be altered.) 


Row 21 contains the column totals. The TOTAL row is computed by using the 
SUMPRODUCT function of Excel which computes the dot product of two vectors. For 
example, total cost is the product of quantity times price. 

B21 is the total cost, $7.81, associated with purchasing 3 Hamburgers, 4 boxes of 
Wheaties, and 3 Milks. 

C21 is the total calories, 1455, for 3 Hamburgers, 4 boxes of Wheaties, and 3 Milks. 

D21 through M21 represent the total amount of protein, fat, sodium, vitamin A, C, B1, 
B2, niacin, calcium, and iron in that diet. 

Row 22 represents minimum USDA requirements for each of these essential food groups 
or vitamins. 

Row 23 represents maximum USDA requirements for each of these essential food groups 
or vitamins. What does the formula in E23 represent? There are 9 cal/gram of fat. 
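
The TOTAL row's dot product can be sketched in a few lines. The prices below are hypothetical round numbers chosen for illustration; they are not the template's prices and will not reproduce the $7.81 total.

```python
def sumproduct(xs, ys):
    """Dot product of two equal-length sequences, like Excel's SUMPRODUCT."""
    if len(xs) != len(ys):
        raise ValueError("ranges must be the same length")
    return sum(x * y for x, y in zip(xs, ys))

# Hypothetical prices for Hamburger, Wheaties, Milk (NOT the template's values).
prices = [1.00, 0.50, 0.75]
quantities = [3, 4, 3]          # the quantities used in the handout's example

total_cost = sumproduct(prices, quantities)
print(f"total cost = ${total_cost:.2f}")
```

The same function applied to the calories column and the quantity column would give the total calories, and so on for each nutrient column.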


Suppose you decide to purchase all of your food for one day from McDonald's. You are 
extremely concerned with eating a proper diet that meets or exceeds the USDA 
recommended guidelines. However, you only have $20 cash until your next check and 
you would like to have enough money left over to go out later that night. Change the 
values of cells N5:N19 to find the amount of each item on the menu that will minimize 
cost and still meet all of the USDA requirements. 


Homework 9: Big Mac Attack LP 


Part 1. Homework 8 used a trial and error technique to find the amount of each item on the menu 
that minimized total cost and still met all of the USDA requirements. Use Excel Solver to 
determine whether you found the correct answer. Solver is on the Tools menu. 


Set target cell $B$21 equal to Min by changing cells $N$5:$N$19 


Subject to the structural constraints that the diet has: 
greater than or equal to the minimum grams of protein 
less than or equal to the maximum grams of fat 
less than or equal to the maximum grams of sodium 
greater than or equal to the minimum percent of vitamin A 
greater than or equal to the minimum percent of vitamin C 
greater than or equal to the minimum percent of vitamin B1 
greater than or equal to the minimum percent of vitamin B2 
greater than or equal to the minimum percent of niacin 
greater than or equal to the minimum percent of calcium 
greater than or equal to the minimum percent of iron 
and the non-negativity constraints that the amount of the food items in N5:N19 must be 
greater than or equal to 0. 


Clicking on “Solve” after entering the constraints of the problem produces a solution and 
prompts whether you would like to generate reports. 
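
To see the shape of the problem Solver is being given, here is a toy two-food version solved by checking every corner of the feasible region. All numbers are hypothetical, and corner enumeration is a brute-force illustration that only works for tiny, bounded problems; it is not a description of how Solver finds its answer.

```python
from itertools import combinations

def solve_2var_lp(cost, constraints, maximize=False, eps=1e-9):
    """Optimize cost[0]*x + cost[1]*y over constraints a1*x + a2*y <= b.

    Tries the intersection of every pair of constraint lines and keeps the
    best feasible one. Returns (value, x, y), or None if nothing is feasible.
    """
    best = None
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < eps:
            continue                      # parallel lines never meet
        x = (b * c2 - a2 * d) / det       # Cramer's rule
        y = (a1 * d - b * c1) / det
        if all(p * x + q * y <= r + eps for p, q, r in constraints):
            value = cost[0] * x + cost[1] * y
            better = best is None or (value > best[0] if maximize else value < best[0])
            if better:
                best = (value, x, y)
    return best

# Hypothetical data: x = hamburgers, y = milks.
COST = (0.60, 0.50)
CONSTRAINTS = [
    (-12, -8, -40),   # protein: 12x + 8y >= 40, rewritten as <=
    (10, 5, 40),      # fat:     10x + 5y <= 40
    (-1, 0, 0),       # x >= 0
    (0, -1, 0),       # y >= 0
]

cheapest = solve_2var_lp(COST, CONSTRAINTS)
priciest = solve_2var_lp(COST, CONSTRAINTS, maximize=True)
print("min cost:", cheapest)
print("max cost:", priciest)
```

Even in this toy case the cheapest diet calls for a fractional number of hamburgers, which is the kind of result the "Is the solution realistic?" question is probing.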


Questions: What is the minimum cost? How much of each food item should be purchased? Is 
the solution realistic? 


Part 2. Part 1 asked you to find a diet that minimizes cost. In part 2, find the diet that maximizes cost. 
The maximum cost diet is relevant in determining the cost range of all feasible diets as well 
as for studying palatability, variety maximization, and compromise diets. If the range is 
wide, then there is a large number of feasible diets. If the range is narrow, then there is a 
small number of feasible diets. 


Questions: What is the maximum cost? How much of each food item should be purchased? Is 
the solution realistic? How much more does this diet cost than the cost-minimizing diet? 
Homework 10: Big Mac Attack - Sensitivity Analysis 


Part 1. Homework 9 used the Excel Solver to determine the optimum solution to the cost 
minimization problem. Re-solve the cost minimization model. Prior to clicking on 
“Solve”, click on Options. Under the Options box click on “assume linear model”. After 
entering all the components of the optimization problem, click on “Solve” to produce a 
solution to your model. Excel displays a dialogue box labeled 


“Solver Results 
Solver found a solution. All constraints 
and optimality conditions are satisfied.” 


Excel also allows you to create Answer, Sensitivity, and Limits reports. Hold down the 
Ctrl key and use the mouse to select the Answer and Sensitivity reports. These two 
reports are added as new worksheets to your workbook. If your Sensitivity report does 
not contain “Allowable Increase and Allowable Decrease” for Objective Coefficient and 
Shadow Price, then you forgot to click on “assume linear model” under the options box. 

Part 2. Use the information in the Answer Report to answer the following questions: 

1. If a constraint is binding, how much slack (unused resource) is available? Zero 

2. How many decision variables (adjustable cells) does the model contain? 15 

3. How many binding constraints does the model contain? 15 

4. Is there any relationship between the number of decision variables and the number of 
binding constraints? number of binding constraints >= number of decision variables 

5. Are all the binding constraints structural? 5 structural; 10 non-negativity constraints 

6. Suppose McDonald’s is out of orange juice. How much would you be willing to pay 
for one orange juice from another source to help you meet the cost minimization diet? 
Assume it is the same size and quality. Orange Juice is not binding, therefore $0.00 
Part 3. Use the information in the Sensitivity Report to answer the following questions: 


Shadow Prices (Convert output so that shadow price has 5 decimal places) 


1. Why is the shadow price for Total grams protein 0.00000? Non-binding constraints 
do not have a shadow price. 

2. Do all the binding constraints have a shadow price? Yes 

3. If the USDA adopted a 50% increase in the vitamin C requirement, what is the 
increased cost to the cost-minimizing diet associated with this change? 
$.007 / percent change in vitamin C x 50% = $.35 increase 


Reduced Costs (Convert output so that reduced cost has 5 decimal places) 


4. McDonald’s adopted a policy of charging $0.10 for honey. What is the effect of this 
policy on the cost-minimizing diet? The cost-minimizing diet does not use honey; no effect. 

5. How much would the price of orange juice have to decrease before it would be 
attractive to consider adding orange juice to the cost-minimizing diet? by the amount 
of its reduced cost, $.064 

6. If you were forced to include one Egg McMuffin in your diet (and adjusted other items 
in your diet to reflect the nutritional value of an Egg McMuffin), by how much would 
the cost of the cost-minimizing diet increase? by the amount of its reduced cost, $.561 


Right-hand side ranging (Allowable increase and Allowable decrease) 


7. Suppose you are at risk of coronary problems and your doctor limits your daily fat to 
40 grams. Is the cost-minimizing diet still acceptable? Why or why not? No, the 
cost-minimizing diet allows 52.5 grams of fat. 40 grams is below this value and outside the 
RHS lower bound ranging limit. 

8. Suppose your doctor puts you on a low-sodium diet and decreases your daily sodium 
limit to 2200 mg. Is the cost-minimizing diet still acceptable? Why or why not? 
Sodium is a binding constraint with the cost-minimizing diet. 3000 mg sodium will be 
consumed; therefore, the diet must change. Shadow prices allow us to determine that 
the new cost-minimizing diet will cost $6.36 because 2200 mg sodium is within the 
RHS ranging limits. -0.000298 x (-800) = $.238; $6.126 + $.238 = $6.364 


Objective ranging (Allowable increase and Allowable decrease) 


9. McDonald’s increases the price of hamburgers by $0.08, the price of garden salads by 
$0.20, the price of Wheaties by $0.25, and the price of milk by $0.30. Is the 
cost-minimizing diet still acceptable? What is the cost of the cost-minimizing diet under the 
new pricing policy? Amount of each of the 15 food items does not change. Cost will 
increase by $.08 x 5.299 + $.20 x .535 + $.25 x .812 + $.30 x 1.441 = $1.166 ($7.29) 

10. Suppose the price of garden salad increased by $0.50. What effect does this pricing 
policy have on the optimal solution for the cost-minimizing diet? Is the 
cost-minimizing diet still acceptable? A 50 cent price increase is outside the upper bound 
for objective ranging, therefore the effect on cost cannot be determined without 
re-solving. However, the diet composition would change. 
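
The arithmetic in the worked answers above can be checked with a short script. It uses only the shadow prices, quantities, and base cost quoted in those answers; the variable names are illustrative.

```python
base_cost = 6.126   # cost of the cost-minimizing diet quoted in the answers

# Shadow price x change in the right-hand side = change in optimal cost
# (valid only while the change stays inside the RHS ranging limits).
vitamin_c_increase = 0.007 * 50                  # $.007 per point x 50 points
sodium_increase = -0.000298 * (2200 - 3000)      # limit tightened by 800 mg
new_cost_sodium = base_cost + sodium_increase

# Objective ranging: the diet is unchanged, so the cost rises by
# (price increase x quantity purchased) summed over the affected items.
price_increases = [0.08, 0.20, 0.25, 0.30]       # hamburger, salad, Wheaties, milk
quantities = [5.299, 0.535, 0.812, 1.441]
price_effect = sum(dp * q for dp, q in zip(price_increases, quantities))
new_cost_prices = base_cost + price_effect

print(f"vitamin C: +${vitamin_c_increase:.2f}")
print(f"sodium:    +${sodium_increase:.3f} -> ${new_cost_sodium:.3f}")
print(f"prices:    +${price_effect:.3f} -> ${new_cost_prices:.2f}")
```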


Homework 11: Big Mac Attack Integer Programming 
Part 1. Homework 9 and 10 allowed the amount of each item on the menu to be floating point or 
fractional values. Thus, the model is very unrealistic since it is difficult to purchase 5.3 
hamburgers. Add the constraint that the amount of the food items in N5:N19 must be 
integer values (N5:N19 int integer). 


Solve the problem with the new constraints. How 
much of each food item should be purchased? 





Part 2. Instead of minimizing cost, suppose we are on a diet and need to minimize calories. 





compared with the cost-minimization diet? 


Part 3. Return the model to the cost minimization model in Part 1 above. The cost-minimizing 
diet is boring and the cost-maximizing diet is too expensive as well as boring (Salad, 
yogurt, and Grapefruit juice). Suppose you create a compromise diet subject to all the 
constraints in Part 1 above plus the following constraints: 

One order of Fries with each burger (N5+N6+N7) 
At least three drinks (milk or juice) 
No more than two Wheaties 
umber equal number Wheaties 
In addition to any milk taken with Wheaties at least two more drinks (milk or juice) 
At least one Egg McMuffin or Wheaties 
At least one salad (Chef or Garden salad) 
No more packages of honey than Chicken 
Homework 12: Braincel Tutorial 


BCDATA.xls contains 17 records of data on loan applications as well as a loan officer's 
indication of the applicant’s ability to repay the loan on a scale of 1 to 5. A “1” means very poor 
loan repayment probability and “5” means excellent loan repayment probability. Other fields for 
each applicant include: monthly income, monthly expense, owns home, years at present job, years 
at previous job, years at present address, years at previous address, and number of dependents. Of 
course, a loan neural network would need a larger database than 17 records. However, the 
tutorial demonstrates how to use Braincel to make loan repayment forecasts based on previous 
applications. 


Enable Braincel by selecting the file braincel.xll from the braincel directory. Complete the 
Braincel tutorial on pages 158-165 of the course pack (Part A). 


Homework 13: Evolver Tutorial 


Tutorial.xls contains a spreadsheet describing a resource allocation problem where each worker 
must perform one task. The table rates how well each of ten workers performs each of ten tasks. 
Each worker’s ability to perform each task is rated on a scale of 0 to 10 (0=cannot do the task, 
10=perfect at the task). The optimization problem is to match each worker to a task so that 
overall productivity is maximized. The “Chosen Task” column (column N) to the right of the ratings 
assigns each worker arbitrarily to one task. The next column (column P) enters each worker’s 
rating for that task. Finally, the “Total Score” is the sum of the individual ratings for all 10 tasks. 


There are 10! (10 x 9 x 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 = 3,628,800) possible ways to assign workers 
to their tasks. Although this problem may seem simple, the problem’s complexity increases 
exponentially as we add additional constraints (e.g., a certain task requires two workers). 
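
The size of the search space, and what "Total Score" means for one assignment, can be sketched with an exhaustive search over a tiny example. The 3 x 3 ratings table is hypothetical, and trying every permutation is shown only for illustration; it says nothing about the search method Evolver uses.

```python
from itertools import permutations
from math import factorial

def best_assignment(ratings):
    """Try every one-to-one worker-to-task assignment; return (score, assignment).

    ratings[w][t] is worker w's rating on task t; assignment[w] is w's task.
    """
    n = len(ratings)
    best_score, best_perm = -1, None
    for perm in permutations(range(n)):
        score = sum(ratings[w][perm[w]] for w in range(n))   # the "Total Score"
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm

# Hypothetical ratings for 3 workers x 3 tasks (scale 0 to 10).
ratings = [
    [9, 2, 5],
    [6, 8, 3],
    [4, 7, 10],
]

score, assignment = best_assignment(ratings)
print("best total score:", score, "assignment:", assignment)
print("assignments to check for 10 workers:", factorial(10))
```

Three workers need only 6 checks; ten workers need 3,628,800, which is why exhaustive search stops being practical as the table grows.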


Using Evolver to solve the problem is similar to using Solver. Evolver is on the Tools menu. 
Load Evolver and solve this problem by following the instructions in the tutorial on pages 173- 
184 of the course pack (Part A). 
Homework 14: Database Forms and Complex Queries Using Data Tables 

This exercise provides practice working with Excel 5.0 commands used to manipulate databases 
or lists. Download the template homewk14.xls from the student network. This file contains the 
OPIM 101 class list (except for names and social security numbers). This list contains the field 
names: Order, Section, College, Class, Major, and ID Code. Each field is a column and the columns 
contain field values (e.g., the field “Section” has values of 1, 2, or 3). Each record in the list is a 
row that contains a unique collection of field values. This list does not contain duplicate records. 


A data form is a dialog box used to add, delete, edit, and find records in a list. Copy the field 
names to cells A13:F13 and define the named range Criteria as $A$13:$F$14 and Database as 
$A$23:$F$290. Select Form under the Data menu to open the data form dialog box. Add a new 
record for student 300. 


UNDC| XYZ 






After adding the record, examine the named range “Database”. Excel automatically adjusts the 
list range to include new records in the named range. It also prevents you from overwriting 
existing records on your spreadsheet. 


Now delete the record you just added. Again note that Excel automatically adjusts the list range 
to delete the record from the named range “Database”. 


Use the criteria feature to find out how many marketing students are enrolled in OPIM 101. 
Select the criteria button then enter MKTG in the field for majors. The data form dialog box 
displays the first record found. The Find Prev and Find Next buttons can be used to navigate 
through the selected records. 


Find the number of marketing students enrolled in OPIM 101. 


Use the criteria fields to complete a more complex search. Find the number of students who are 
juniors in the College with a declared major (not undecided, <> UNDC) enrolled in OPIM 101. 


Find all freshmen with majors declared as undecided and all Wharton seniors. 
The criteria range in the data form dialog box cannot handle complex queries. Use database 
functions and named criteria ranges for more complex queries. “Find all freshmen with majors 
declared as undecided and all Wharton seniors” is a compound query. To specify multiple criteria 
for a single query use multiple lines in the criteria range. One line specifies freshmen (FR) 
undecided (UNDC); the other specifies Wharton (WH) senior (SR). Copy the field names to the 
cell range A2:F2 and complete the query. Use the database function 
=DCOUNT(Database,"Order",A2:F4) to find how many students in OPIM 101 fit these criteria. 


Change the criteria range to C2:D4. What does this value represent? How many students in 
OPIM 101 fit these criteria? 


In addition to multiple lines, criteria ranges may contain multiple copies of the field names. 
Suppose you wanted to count all the records in the OPIM 101 class list with order value greater 
than 100 but less than or equal to 200. Copy the field names to the cell range A8:F8 and repeat 
the field name Order in cell G8. Enter >100 under the first “Order” field name and <=200 under 
the second (cells A9 and G9), then use the database function =DCOUNT(Database,"Order",A8:G9) 
to count the records that fit these criteria. Search for the topic “and/or criteria range” using 
Excel help to find additional examples. 
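
The criteria-range logic described above (conditions on one line are combined with AND, separate lines are combined with OR, and a field name may be repeated to bound a value on both sides) can be sketched outside Excel. The following Python sketch is only an illustration: the five records are hypothetical, the function names `matches` and `dcount` are invented for this example, and text criteria are simplified to exact matches, which is narrower than Excel's own matching.

```python
import operator

# Comparison prefixes, longest first so ">=" is not mistaken for ">".
OPERATORS = [(">=", operator.ge), ("<=", operator.le), ("<>", operator.ne),
             (">", operator.gt), ("<", operator.lt), ("=", operator.eq)]

def matches(value, criterion):
    """Test one cell value against one criteria cell such as '>100' or 'FR'."""
    for symbol, compare in OPERATORS:
        if criterion.startswith(symbol):
            operand = criterion[len(symbol):]
            if isinstance(value, (int, float)):
                operand = float(operand)
            return compare(value, operand)
    return value == criterion  # simplification: plain text must match exactly

def dcount(records, criteria_rows):
    """Count records that satisfy every condition on at least one criteria row."""
    return sum(
        any(all(matches(record[field], crit) for field, crit in row)
            for row in criteria_rows)
        for record in records
    )

# Hypothetical class-list records (not the real OPIM 101 list).
records = [
    {"Order": 50, "College": "WH", "Class": "SR", "Major": "MKTG"},
    {"Order": 120, "College": "COL", "Class": "FR", "Major": "UNDC"},
    {"Order": 150, "College": "COL", "Class": "JR", "Major": "ECON"},
    {"Order": 200, "College": "WH", "Class": "SR", "Major": "FNCE"},
    {"Order": 250, "College": "EAS", "Class": "FR", "Major": "UNDC"},
]

# Two criteria lines: undecided freshmen OR Wharton seniors.
compound = [[("Class", "FR"), ("Major", "UNDC")],
            [("College", "WH"), ("Class", "SR")]]
# One criteria line with the field name Order repeated: 100 < Order <= 200.
order_band = [[("Order", ">100"), ("Order", "<=200")]]

print(dcount(records, compound))
print(dcount(records, order_band))
```

Each criteria row is written as a list of (field, condition) pairs rather than a dictionary so that the same field name can appear twice, mirroring the repeated Order column.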


Data Tables: Use a data table to find the number of freshmen, sophomores, juniors and seniors 
enrolled in OPIM 101 from each college. The named ranges “Database” and “Criteria” will make 
it easier to build a Data Table using database functions. Enter the data table below the named 
range Criteria as shown below. Cell A15 contains the database function formula 
=DCOUNT(Database,"Order",Criteria) where “Order” is the field name of the values to be 
counted. Use the help feature in Excel to see examples of other database functions (e.g., 
DAVERAGE, DMIN, DSUM). 


[Worksheet figure: row 13 holds the field names Order | Section | College | Class | Major | ID Code 
in columns A through F. The rest of the figure is not recoverable.] 
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Always attempt the homework prior to the class it is due! 
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Never spend more than 45 minutes on the homework exercise! 
If you can’t understand it in 45 minutes, then bring your questions to class. 


Homework 15: Database Sorting, AutoFilter, Subtotals, and Pivot Tables 
This exercise provides practice working with Excel 5.0 commands used to manipulate databases 
or lists. Copy the OPIM 101 class list from homework 14 to a new worksheet. 


Sorting a list facilitates viewing groups of records in the list. Sort the records in the OPIM 101 
class list by College, Class, and Major. Record 11 should be the first row in the list after sorting. 


Using the sorted list, obtain subtotals by College and Class using the Data | Subtotals command. 
Subtotals must be executed twice. Select subtotals for College first. Then select subtotals for 
Class. Be certain the box "Replace Current Subtotals" is not selected; otherwise Excel will remove 
the previous subtotal for College. Click on outline level button 3 to collapse the record detail and 
show summary data for each group. Do these subtotals agree with the values in the Data Table 
from Homework 14? 


Remove all subtotals by selecting the "Remove All" box in the subtotal dialog box. Select a cell 
in the OPIM 101 class list. Select the Data | Filter | AutoFilter command. Excel adds drop-down 
arrows to the cells containing the field names. Clicking on the “Major” field displays all the 
majors for all students in the class. By selecting “MKTG”, only records of students with a 
marketing major will be displayed. How many records meet this criterion? Does this value agree 
with the value found in homework 14? 


AutoFilter allows more complex filtering of records by using multiple fields with custom 
criteria. Use the criteria fields to complete a more complex search. Find the number of students who 
are juniors in the College with a declared major (not undecided, i.e., <>UNDC) enrolled in OPIM 
101. Does this value agree with the value found in homework 14? 


Define a criteria range and use the Data | Filter | Advanced Filter command to find all freshmen 
with majors declared as undecided and all Wharton seniors. Does this value agree with the value 
found in homework 14? 
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Use the Pivot Table Wizard to create a Pivot Table like the one below. Use Section as a page 
value, College as a row value, Class as a column value and count of ID Code as a data value. By 
selecting different page views, the Pivot Table displays the number of freshmen, sophomores, 
juniors and seniors enrolled in OPIM 101 from each college for each section or for all sections 
combined. 


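[Pivot table figure: page field Section; data field Count of ID Code; rows by College (COL, ...); 
columns by Class (FR, JR, SO, SR, Grand Total). The cell values are not recoverable.] 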
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Homework Exercise 
OPIM 101 Introduction to the Computer as an Analysis Tool 


Homework Rules: 
Always attempt the homework prior to the class it is due! 
Homework is for class discussion only. It is not graded. 
Never spend more than 45 minutes on the homework exercise! 
If you can’t understand it in 45 minutes, then bring your questions to class. 


Homework 16: Visual Basic object properties, ranges, input boxes, message boxes 

This exercise provides practice working with Excel 5.0 Visual Basic. The Visual Basic program 
will be written on the module sheet, named homework16; and a worksheet, named HW16, will be 
used to display portions of the program output. Write a Visual Basic program to prompt you for 
your first name and last name. Next, a message box asks whether you have entered these fields 
correctly. Finally, output and format the values in cells A1:A4. 


First, familiarize yourself with the syntax for the commands used in this program by reading the 
following pages of the Excel 5.0 Superbook. [ InputBox (pp. 723-725), MsgBox (pp. 720-723), 
Range( ).formula (p. 710), as well as commands for selecting cells (chapter 53) and recording 
macros (pp. 653-658) ] Often many of the actions in a macro can be recorded using the Record 
Macro feature. Once recorded, these actions can be copied from one module sheet to another. 


Begin the module with the statement Option Explicit (p. 683). This Visual Basic command will 
cause Visual Basic to generate an error message if variables are not declared prior to their use in 
the execution of the program. This is very useful for writing good code. 


The elementary unit of Visual Basic programming is called a procedure. [Note: the terms macro 
and procedure are equivalent.] Begin your program by defining a sub procedure called 
homework16. Declare the variables lastname, firstname, fullname as String and answer and i as 
Integer. 


Use two InputBox commands to prompt the user for their first and last name. The InputBox 
command creates a dialog box to prompt the user for input. The syntax for this method is: 
firstname = InputBox(prompt:="Enter your first name:", Title:="Input Box") where 
firstname is a variable declared as a String to accept the input from the InputBox command. 


The concatenation operator & can be used to join the first and last name into the variable 
fullname. Insert appropriate spaces and a question mark to use the concatenated name in the 
message box prompt that asks the user whether the first and last names were entered correctly. 
The syntax for this method is: answer = MsgBox("Is your name " & fullname, 3, "Verify 
Name"). The 3 is used to obtain a Yes/No/Cancel type of message box. 


Use the record macro feature to record the Visual Basic commands to select a worksheet named 
HW16 and clear the contents of the entire sheet. Copy these commands into your homework 16 
module. Next use the Range("A1").Formula property to assign the contents of the variables 
firstname, lastname, fullname to cells A1, A2, and A3 and put the answer to the message box 
question in cell A4. Finally, use the record macro feature to record the Visual Basic commands to 
autofit the column width and make A1 the active cell on the worksheet HW16. Copy these 
commands into your homework 16 module. 
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Homework Exercise 
Homework 17: Visual Basic control statements 

This exercise provides practice working with Excel 5.0 Visual Basic. The Visual Basic program 
will be written on a module sheet, named homework17 and a worksheet, named HW17 will be 
used to display portions of the program output. The program computes the total electricity cost 
given kilowatt hour usage (kwh). Rates for electricity are shown in the table below. Use nested if 
then else statements or the select case statement to assign the appropriate rate. Use an InputBox 
to prompt the user for total kilowatt hour usage. Compute the total electricity cost. Display the 
total kilowatt hour usage, rate and total electricity cost in three successive rows of column A. 
Use a message box to ask the user whether to continue the calculations. If so, use a GoTo 
statement to transfer control back to the beginning of the input section of the program. Continue 
to display the total kilowatt hour usage, rate and total electricity cost in three successive rows of 
column A with a blank row between each output area. Do not erase previously displayed values. 


First, familiarize yourself with the syntax for the new commands used in this program by reading 
the following pages of the Excel 5.0 Superbook. [ If statements (pp. 689-693), select case (pp. 
693-694) ] Select case is much easier to use to solve this problem, but you may use either one. 
Begin the module with the statement Option Explicit (p. 683). Define a sub procedure called 
homework17. Declare the variables kwh, rate as Single; answer, i as Integer. Use i as a counter 
for rows. Initialize the value of i to 0. Create a line label for transferring control using a GoTo 
statement. Line labels are not case sensitive, but they must begin with a letter and end with a 
colon. Each line label must be uniquely named. Use an InputBox command to prompt the user 
for total kilowatt hour usage. Assign this value to the variable kwh. Next, create the nested if 
then else statements or the select case statements to assign the appropriate electric rate. 


Use the record macro feature to record the Visual Basic commands to select a worksheet named 
HW17 and clear the contents of the entire sheet. Copy these commands into your homework 17 
module. Next, use the Range("$A$" & i).Formula property to label values in column A and put 
the contents of the variables kwh, rate, and kwh*rate in cells B1, B2, and B3. Finally, use the 
record macro feature to record the Visual Basic commands to autofit the column width and 
format the values in cells B1, B2, B3. Copy these commands into your homework 17 module. 
Use a message box and GoTo statement as described above to either exit or continue. 


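[Rate table not recoverable from the source. The surviving fragments suggest usage brackets 
with boundaries at 2000 and 4000 kilowatt hours.] 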
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Homework Exercise 
Homework 18: Visual Basic loop statements 

This exercise provides practice working with Excel 5.0 Visual Basic. The Visual Basic program 
will be written on a module sheet, named homework18 and a worksheet, named HW18 will be 
used to display portions of the program output. The program simulates a coin toss. If the toss 
comes up heads, then the trial continues. If the toss comes up tails, then the trial ends. The 
macro calculates the maximum number of consecutive heads attained over a certain number of 
trials. 


First, familiarize yourself with the syntax for the new commands used in this program by reading 
the following pages of the Excel 5.0 Superbook [ For loops (pp. 696-698), while loops (pp. 695- 
696) ]. Begin the module with the statement Option Explicit (p. 683). Define a sub procedure 
called homework18. Declare variables for the macro as well as the constants head=0 and tail=1. 
Use an InputBox command to prompt the user for number of trials. Next begin a for loop 
incremented from one to the number of trials. Set the counter for the number of consecutive 
heads to zero. Use the following if statement to simulate a coin flip within the body of the for 
loop. 

    If (Rnd() < 0.5) Then 
        flip = head 
        consecutive_heads = consecutive_heads + 1 
    Else 
        flip = tail 
    End If 

Use a while loop to continue flipping a coin while the flip is equal to heads. Count the number of 
consecutive heads. Outside the body of the while loop, determine whether the number of 
consecutive heads is greater than any prior number of consecutive heads attained. If so, update 
the maximum count. Continue to the next trial within the for loop. 
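
The loop structure described above can be sketched in Python to check the logic before writing it in Visual Basic. This is a minimal sketch, not the assigned solution: the function name `max_consecutive_heads` and the `rnd` parameter are invented for this example, and `random.random` stands in for Visual Basic's Rnd().

```python
import random

def max_consecutive_heads(trials, rnd=random.random):
    """Return the longest run of heads seen in any single trial.

    Each trial keeps flipping while the flip is heads (rnd() < 0.5)
    and ends on the first tails.
    """
    best = 0
    for _ in range(trials):          # for loop over trials
        run = 0                      # reset the counter for this trial
        while rnd() < 0.5:           # while loop: continue while heads
            run += 1
        if run > best:               # compare outside the while loop
            best = run
    return best

print(max_consecutive_heads(100))
```

Passing the random source as a parameter makes the routine easy to check with a fixed sequence of values.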


Use the record macro feature to record the Visual Basic commands to select a worksheet named 
HW18 and clear the contents of the entire sheet. Copy these commands into your homework 18 
module. On worksheet HW18, label cell A1 “number of trials”. Put the value for the number of 
trials in cell B1. Label cell A2 “number of consecutive heads”. Put the value for the number of 
consecutive heads in cell B2. Label cell A3 “observed probability”. Put the value for observed 
probability in cell B3 [number of consecutive heads / number of trials]. Label cell A4 “actual 
probability”. Put the value for actual probability in cell B4, where actual probability 
= 0.5^(number of consecutive heads). Finally, use the record macro feature to record the Visual Basic 
commands to autofit the column width and format the values in cells A1:B4. Copy these 
commands into your homework 18 module. 
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Homework Exercise 
Homework 19: Visual Basic loop statements 


This exercise provides practice working with Excel 5.0 Visual Basic. The Visual Basic program 
will be written on a module sheet, named homework19 and a worksheet, named HW19 will be 
used to display portions of the program output. The program simulates a coin toss. If the toss 
comes up heads, then the trial continues. If the toss comes up tails, then the trial ends. The 
macro calculates the maximum number of consecutive heads attained over a certain number of 
trials. 


First, familiarize yourself with the syntax for the new commands used in this program by reading 
the following pages of the Excel 5.0 Superbook [ For loops (pp. 696-698), while loops (pp. 695- 
696) ]. Begin the module with the statement Option Explicit (p. 683). Define a sub procedure 
called homework19. Declare variables for the macro as well as the constants head=0 and tail=1. 
Use an InputBox command to prompt the user for number of trials. Next begin a for loop 
incremented from one to the number of trials. Set the counter for the number of consecutive 
heads to zero. Use the following if statement to simulate a coin flip within the body of the for 
loop. 

    If (Rnd() < 0.5) Then 
        flip = head 
        consecutive_heads = consecutive_heads + 1 
    Else 
        flip = tail 
    End If 

Use a while loop to continue flipping a coin while the flip is equal to heads. Count the number of 
consecutive heads. Outside the body of the while loop, determine whether the number of 
consecutive heads is greater than any prior number of consecutive heads attained. If so, update 
the maximum count. Continue to the next trial within the for loop. 


Use the record macro feature to record the Visual Basic commands to select a worksheet named 
HW19 and clear the contents of the entire sheet. Copy these commands into your homework 19 
module. On worksheet HW19, label cell A1 “number of trials”. Put the value for the number of 
trials in cell B1. Label cell A2 “number of consecutive heads”. Put the value for the number of 
consecutive heads in cell B2. Label cell A3 “observed probability”. Put the value for observed 
probability in cell B3 [number of consecutive heads / number of trials]. Label cell A4 “actual 
probability”. Put the value for actual probability in cell B4, where actual probability 
= 0.5^(number of consecutive heads). Finally, use the record macro feature to record the Visual Basic 
commands to autofit the column width and format the values in cells A1:B4. Copy these 
commands into your homework 19 module. 
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Homework Exercise 
Homework 20: Visual Basic - arrays 

This exercise provides practice working with Excel 5.0 Visual Basic. The Visual Basic program 
will be written on a module sheet, named homework20 and a worksheet, named HW20 will be 
used to display portions of the program output. The macro converts a dollar amount in cents into 
the maximum number of half-dollars, quarters, dimes, nickels and pennies that yield the equivalent 
dollar value. For example, the table below shows that $1.43 is equal to two half-dollars, one 
quarter, one dime, one nickel and three pennies. 





Sub homework20() 
    Dim initial_amount As Integer 
    initialize 
    initial_amount = amount 
    final_formatting (initial_amount) 
End Sub 


Begin the module with the statement Option Explicit (p. 683). Define a sub procedure called 
homework20. Declare initial_amount as Integer for the macro. Call a subroutine initialize, a 
function amount and a subroutine final_formatting. The subroutine initialize activates worksheet 
HW20 and clears the contents of the worksheet. The value initial_amount is passed to the 
subroutine final_formatting. This subroutine autofits and formats the cells as shown in the table 
above, then the label ‘Amount of change in cents:’ is entered in cell C1 and the value in cell D1. 


In the function amount, dimension an array named table and initialize the values to the value of 
each of the five coins. Use an InputBox command to prompt the user for the value of the change 
in cents. Use a for loop to compute the number of coins of each value that equals the initial 
amount in cents. Compute number of coins and amount as shown below: 


    quantity = Int(amount / table(index))    ' number of coins calculation 
    amount = amount Mod table(index)         ' modulus operator returns remainder of division 
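
The two statements above carry the whole calculation, and a short Python sketch makes the loop easy to check against the $1.43 example. It is an illustration only, not the Visual Basic solution: the names `COIN_VALUES` and `make_change` are invented here, `//` plays the role of Int(amount / table(index)), and `%` plays the role of Mod.

```python
# Coin values in cents, largest first: half-dollar, quarter, dime, nickel, penny.
COIN_VALUES = [50, 25, 10, 5, 1]

def make_change(amount_cents):
    """Return the number of each coin, in COIN_VALUES order, for an amount in cents."""
    quantities = []
    for value in COIN_VALUES:
        quantities.append(amount_cents // value)  # number of coins of this value
        amount_cents = amount_cents % value       # remainder left to convert
    return quantities

print(make_change(143))
```

For 143 cents the loop yields two half-dollars, one quarter, one dime, one nickel and three pennies, matching the example in the exercise.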
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Spreadsheet Limitations 





■ Spreadsheet limitations 
* Each constant parameter can only assume one value at a time 
* "What-if" analysis always results in a single point estimate 


■ Monte Carlo simulation 
* Allows a range of possible input values 
* Statistical summary of outcome possibilities 







Monte Carlo Simulation 
OPIM 101 
The Wharton School of the University of Pennsylvania 


Deterministic versus Probabilistic 

■ Deterministic 
* Makes use of average or "estimated" values for projected outcomes 
* Does not take uncertainty into account 
* Example: How many units will we sell next year? I estimate about 10,000. 

■ Probabilistic 
* Admits to uncertainty in projected outcomes 
* Takes this quantified uncertainty into account in the subsequent analysis 
* Example: How many units will we sell next year? I'd say somewhere between 8,500 and 
12,000. Any value in that range has an equal chance of occurring. 
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Risk Analysis in Capital Budgeting 









■ How large is the initial market? 
* Expected value 250,000 

■ How much can we charge for the product? 
* Expected value $510 

■ How fast is the market growing? 
* Expected value 3% 

■ What will our market share be? 
* Expected value 12% 
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Risk Analysis in Capital Budgeting 





■ How large is the initial market? 
* Expected value 250,000 
* Range 100,000 - 340,000 
■ How much can we charge for the product? 
* Expected value $510 
* Range $385 - $575 
■ How fast is the market growing? 
* Expected value 3% 
* Range 0.5% - 6.5% 
■ What will our market share be? 
* Expected value 12% 
* Range 3% - 17% 
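
The four inputs above can be fed into a small Monte Carlo run to see how a single point estimate compares with a distribution of outcomes. The sketch below rests on assumptions that are not in the slides: each input is drawn from a triangular distribution whose mode is the expected value, the inputs are independent, and first-year revenue is modeled as market size x (1 + growth) x share x price. The names `INPUTS`, `draw`, and `simulate_revenue` are invented for this example.

```python
import random

# (low, expected, high) for each input, taken from the slide.
INPUTS = {
    "market_size": (100_000, 250_000, 340_000),
    "price": (385.0, 510.0, 575.0),
    "growth": (0.005, 0.03, 0.065),
    "share": (0.03, 0.12, 0.17),
}

def draw(rng, name):
    """Draw one value; the triangular shape is an assumption, not from the source."""
    low, mode, high = INPUTS[name]
    return rng.triangular(low, high, mode)

def simulate_revenue(trials, seed=0):
    """Simulate revenue under the hypothetical model described in the lead-in."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        units = draw(rng, "market_size") * (1 + draw(rng, "growth")) * draw(rng, "share")
        results.append(units * draw(rng, "price"))
    return results

revenues = sorted(simulate_revenue(10_000))
point_estimate = 250_000 * 1.03 * 0.12 * 510   # all inputs at their expected values
print(f"point estimate:      {point_estimate:,.0f}")
print(f"simulated mean:      {sum(revenues) / len(revenues):,.0f}")
print(f"5th-95th percentile: {revenues[500]:,.0f} to {revenues[9500]:,.0f}")
```

The point estimate is one number, while the simulation reports a spread of outcomes, which is the contrast the preceding slides draw between deterministic and probabilistic analysis.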
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Common Probability Distributions 







RAND( ) Returns an evenly distributed random number greater than 
or equal to 0 and less than 1. A new random number is returned 
every time the worksheet is calculated. 


Remarks: To generate a random real number between a and b, use: 
RAND()*(b-a)+a 
If you want to use RAND to generate a random number but don't want 
the numbers to change every time the cell is calculated, you can 
enter =RAND() in the formula bar and press F9 (or COMMAND + = in 
Microsoft Excel for the Macintosh) to change the formula to a random 
number. 
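
The scaling formula RAND()*(b-a)+a carries over directly to other languages. A minimal Python sketch follows, in which `random.random()` plays the role of RAND() and the function name `rand_between` is invented for this example. The bounds 8,500 and 12,000 reuse the unit-sales range quoted earlier.

```python
import random

def rand_between(a, b, rng=random):
    """Python counterpart of the worksheet formula RAND()*(b-a)+a."""
    return rng.random() * (b - a) + a

samples = [rand_between(8_500, 12_000) for _ in range(1_000)]
print(min(samples), max(samples))
```

Because `random.random()` returns a value of at least 0 and below 1, every sample lies at or above a and, apart from floating-point rounding, below b.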


* To generate a random number greater than or equal to 0 
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Behavioral Decision Making 


People make decisions every day 
* Should I read the Primis book for OPIM 101 class? 
* Should we launch the space shuttle Challenger? 
* What kind of car should I buy? 
* What apartment should I rent? 
Three types of decisions: 
* Choices: select from well-defined alternatives 
* Evaluations: determine an amount (in $) to bid 
* Judgments: difficult to define an optimum or 
correct value (e.g., apartment selection) 


Decision Making 
OPIM 101 
The Wharton School of the University of Pennsylvania 
■ Noncompensatory rules 

* Conjunctive rule - a strict cutoff is made for the salient dimensions 
(e.g., price < $13,000 and MPG > 20; cars B, C, & D are not 
considered) 

* Disjunctive rule - if one salient dimension meets the cutoff value, 
the choice is still considered even though another factor fails 
its cutoff (e.g., price < $13,000 or MPG > 20; only car D is not 
considered) 
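
The two rules differ only in whether the cutoffs are joined by "and" or by "or". The Python sketch below illustrates this with made-up data: the prices and MPG figures for cars A through D are hypothetical, chosen only so that the outcome matches the slide (B, C, and D fail the conjunctive rule and only D fails the disjunctive rule). The function names are likewise invented.

```python
# Hypothetical car data; the slide's actual figures are not in the source.
CARS = {
    "A": {"price": 12_500, "mpg": 25},
    "B": {"price": 12_000, "mpg": 18},
    "C": {"price": 14_000, "mpg": 24},
    "D": {"price": 15_000, "mpg": 17},
}

def conjunctive(cars, max_price=13_000, min_mpg=20):
    """Keep a car only if it meets every cutoff."""
    return [name for name, c in cars.items()
            if c["price"] < max_price and c["mpg"] > min_mpg]

def disjunctive(cars, max_price=13_000, min_mpg=20):
    """Keep a car if it meets at least one cutoff."""
    return [name for name, c in cars.items()
            if c["price"] < max_price or c["mpg"] > min_mpg]

print(conjunctive(CARS))
print(disjunctive(CARS))
```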










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— Process Models of Choice 


■ Compensatory rules 
* Trade-offs are made among the cutoff criteria (e.g., for each 
additional safety unit (above 5), I will spend $1000 more for the car.) 
If safety < 5, then price <= 14,000 
If safety = 6, then price <= 15,000 
If safety = 7, then price <= 16,000 
If safety = 8, then price <= 17,000 
If safety = 9, then price <= 18,000 
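
The trade-off schedule above is a straight line: $14,000 at the baseline plus $1,000 per safety unit above 5. A short Python sketch of that rule follows. The function names `max_price` and `is_acceptable` are invented for this example, and treating a safety rating of exactly 5 like the lower ratings (a $14,000 ceiling) is an assumption, since the slide does not list that case.

```python
def max_price(safety):
    """Highest acceptable price for a given safety rating under the trade-off rule."""
    baseline_price, baseline_safety, per_unit = 14_000, 5, 1_000
    extra_units = max(0, safety - baseline_safety)   # only units above 5 earn more budget
    return baseline_price + per_unit * extra_units

def is_acceptable(price, safety):
    """A car is considered if its price is within the ceiling for its safety rating."""
    return price <= max_price(safety)

for rating in range(4, 10):
    print(rating, max_price(rating))
```

Unlike the cutoff rules, a weakness on one dimension (a higher price) can be compensated by strength on another (a higher safety rating).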
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Heuristics and Biases 





■ Availability - over-estimate well-publicized events (probability 
of death from tornado versus lightning; whether a word is more likely 
to begin with “R” or to have “r” as its third letter) 

■ Representativeness - insensitive to base rate probabilities 
(farmers vs. NASA pilots); misconceptions of chance (HTHTTH is 
seen as more random than HHHHTH) 

■ Anchoring and adjustment - insufficient adjustment from 
the anchor (initial ballpark estimate affects subsequent estimate) 

■ Overconfidence bias (OPIM 101 grades; driving skill - 
69% of Swedish and 93% of U.S. drivers rated themselves above-average drivers) 

■ Selective perception - seek information consistent with 
their own view and fail to seek information to disconfirm their view 
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Heuristics and Biases 





■ Recency - give more attention to information heard last (ignore 
facts heard first) 

■ Illusion of control - people pay more for a lottery ticket that 
they pick the number for than if randomly assigned by computer 

■ Utility theory - risk averse in gain situations but risk seeking 
in loss situations 

■ Time pressure - people generally make poorer quality 
decisions under stress conditions due to "satisficing"; they also take 
fewer risks under high time pressure 

■ Group think - social pressures of the minority can unduly 
influence a majority 
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Human Information Processing 
Limitations 


■ Inability to manage decision process information 
* STM (short-term memory) constraints and unreliable recall of information 

■ Difficulty in combining attributes or objectives 
* STM limitations and slow numerical calculations 

■ Inability to systematically search for “optimum” 
* Time constraints, slow numerical calculations, and satisficing 

■ Inaccuracies and biases in heuristic judgments 
* Mental quantitative judgments are difficult 
* Unreliable recall of information 

■ Generating a good problem representation 
* Unable to create a mental representation 
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Decision Models to Support Human 
Information Processing Limitations 


■ Inability to manage decision process information 
* DBMS can store, organize and retrieve data, but the decision maker (DM) needs help 
keeping track of where they are in the decision-making process 
■ Difficulty in combining attributes or objectives 
* Mathematical rules for combining attributes of decision outcomes 
can be used to aid the consistency of the DM's judgment policy 
■ Inability to systematically search for “optimum” 
* Statistical, DSS, and AI tools help ease limited time constraints 
■ Inaccuracies and biases in heuristic judgments 
* DM awareness of their biases through feedback can help reduce 
the problem (e.g., bonuses for weather forecasters) 
■ Generating a good problem representation 
* Visual representation to aid problem solving 
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Models of Decision Making 


■ Rational view 
* Theorems (Utility Theory) and optimal choice 
* Complete, perfect, and instantaneous information 
■ Intuitive view 
* Experience and human expertise 
* Verbal, experimental and historical information 
■ Satisficing 
* Rules of thumb (heuristics) 
* Sees the world as it is rather than as it should be 
* Uses a limited amount of information 
* Generates a few good alternatives, picking one that is good 
enough 
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= ГП] и Романи ~ еләр 

■ Expected value 

■ Decision trees 

■ Bayesian analysis and decision trees 

■ The value of information 


Decision Analysis 
OPIM 101 
The Wharton School of the University of Pennsylvania 





■ Expected monetary value 
EMV = 1/6 x $2 + 1/6 x $5 + 1/6 x $5 + 
1/6 x $10 + 1/6 x $10 - 1/6 x $20 
= $12/6 = $2 
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Expected Value 


■ Should we advertise the new product if 
P(Success) = 0.2 and P(Flop) = 0.8? 

                     Success     Flop 
  Advertise          $15,000     -$3,000 
  Do not advertise   $3,000      -$100 

General form of the payoff table: 

              State 1     State 2 
  Action 1    Payoff11    Payoff12 
  Action 2    Payoff21    Payoff22 

■ EMV(Advertise) = .2 x 15,000 - .8 x 3,000 = $600 
EMV(Do not advertise) = .2 x 3,000 - .8 x 100 = $520 
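
The EMV calculation is a probability-weighted sum across each row of the payoff table, followed by picking the row with the largest sum. A minimal Python sketch follows, using exact fractions so the results match the slide to the dollar. The names `PROBABILITIES`, `PAYOFFS`, and `emv` are invented for this example.

```python
from fractions import Fraction

# State probabilities (0.2 and 0.8) and payoffs in dollars, from the slide.
PROBABILITIES = [Fraction(1, 5), Fraction(4, 5)]
PAYOFFS = {
    "Advertise": [15_000, -3_000],
    "Do not advertise": [3_000, -100],
}

def emv(payoffs, probabilities):
    """Expected monetary value: sum of probability times payoff over all states."""
    return sum(p * x for p, x in zip(probabilities, payoffs))

for action, row in PAYOFFS.items():
    print(action, emv(row, PROBABILITIES))

best_action = max(PAYOFFS, key=lambda action: emv(PAYOFFS[action], PROBABILITIES))
print("Choose:", best_action)
```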
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Expected Value 


A contractor has been invited to bid on a 
construction job. The value of the contract 
depends on the length of time it takes to 
complete the project. If the project is finished on 
time, there is a profit of $50,000. If the project is 
finished late, the contractor will lose $10,000. 
Finishing late is solely dependent upon the 
weather. If the weather is good, the project will 
be finished on time. If the weather is bad, the 
project will be late. The contractor’s subjective 
probability for good weather is 20%. 


Should the contractor bid on the project? 
  Decision       Good Weather   Bad Weather 
  Bid            $50,000        -$10,000 
  Do not bid     $0             $0 
  Probability    20%            80% 


■ EMV(Bid) = .2 x 50,000 - .8 x 10,000 = $2,000 
EMV(Do not bid) = .2 x 0 + .8 x 0 = $0 


■ Therefore, bid on the project 
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Decision Tree 


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Decision Tree 


[Decision tree figure. Legend: decision node, chance node, alternative branch, probability branch. 
Bid (EMV $2,000): 20% Good Weather, $50,000; 80% Bad Weather, ($10,000). 
No Bid (EMV $0): 20% Good Weather, $0; 80% Bad Weather, $0.] 
If p(good weather) > .1667, then bid; else do not bid 

[Decision tree figure: with 16.67% Good Weather, EMV(Bid) = $0, the same as No Bid.] 
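
The .1667 threshold is the probability at which bidding breaks even: p x $50,000 - (1 - p) x $10,000 = 0, so p = 10,000 / 60,000 = 1/6. A small Python sketch of that calculation follows, with the function name `break_even_probability` invented for this example.

```python
from fractions import Fraction

def break_even_probability(profit, loss):
    """Probability of the good outcome at which EMV(bid) equals zero.

    Solves p * profit - (1 - p) * loss = 0 for p, where loss is a positive amount.
    """
    return Fraction(loss, profit + loss)

p = break_even_probability(50_000, 10_000)
print(p, round(float(p), 4))
```

Any probability of good weather above this value gives bidding a positive EMV, which is why the contractor's 20% estimate leads to a bid.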
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Bayesian Analysis and 
Decision Trees 






The contractor has an opportunity to buy a long- 
range forecast from an independent weather- 
forecasting company. The weather-forecasting 
company has a fairly good track record for these 
long-range forecasts. It correctly predicted 
good weather 70% of the time and bad weather 
80% of the time. 
The cost of the forecast is $5,000. 








Probability of each prediction, given the actual weather: 

                 Good Weather   Bad Weather 
  Predict Good   .7             .2 
  Predict Bad    .3             .8 


Bayesian Analysis and 
Decision Trees 











P(predict bad | bad) 0.80 
P(predict bad | good) 0.30 
Bayesian Analysis and 
Decision Analysis 


I1 = prediction of good weather 
I2 = prediction of bad weather 
S1 = good weather 
S2 = bad weather 
P(I1 | S1) = .7    P(I2 | S1) = .3 
P(I1 | S2) = .2    P(I2 | S2) = .8 
P(I1) = P(I1 | S1) P(S1) + P(I1 | S2) P(S2) 
      = (.7)(.2) + (.2)(.8) = .30 
P(I2) = 1.0 - .30 = .70 
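
With the prediction probabilities in hand, Bayes' rule gives the revised probability of good weather after each forecast: P(S | I) = P(I | S) P(S) / P(I). The Python sketch below uses exact fractions. The dictionary layout and the function names `marginal` and `posterior` are invented for this example, while the numbers come from the problem statement.

```python
from fractions import Fraction as F

# Prior probabilities of the actual weather.
PRIOR = {"good": F(2, 10), "bad": F(8, 10)}

# P(prediction | actual weather), keyed by (prediction, actual weather).
LIKELIHOOD = {
    ("predict good", "good"): F(7, 10),
    ("predict bad", "good"): F(3, 10),
    ("predict good", "bad"): F(2, 10),
    ("predict bad", "bad"): F(8, 10),
}

def marginal(prediction):
    """P(prediction), summed over both kinds of actual weather."""
    return sum(LIKELIHOOD[(prediction, state)] * PRIOR[state] for state in PRIOR)

def posterior(state, prediction):
    """P(actual weather | prediction) by Bayes' rule."""
    return LIKELIHOOD[(prediction, state)] * PRIOR[state] / marginal(prediction)

print(float(marginal("predict good")))
print(float(posterior("good", "predict good")))
print(float(posterior("good", "predict bad")))
```

A forecast of good weather raises the probability of good weather from 20% to about 46.7%, while a forecast of bad weather lowers it to about 8.6%.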
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Bayesian Analysis and 
Decision Analysis 





[Decision tree figure: Buy forecast. 
Forecast Good (30%): Bid has EMV $13,000, with 46.7% Good Weather; No Bid is ($5,000). 
Forecast Bad (70%): No Bid is ($5,000). 
The remaining values in the figure are not recoverable.] 
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Bayesian Analysis and 
Decision Analysis 


GLL 24 Sep 94 
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Bayesian Analysis and 

Decision Analysis 

Contract Bid Decision Tree 
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III Value of Information 
■ No forecast EMV 
* Expected monetary value without sample information 
* [(.2)($50,000) + (.8)(-$10,000)] = $10,000 - $8,000 = $2,000 

■ Forecast EMV 
* Expected monetary value with sample information 
* Bayesian decision analysis 
* $400 
* Forecast EMV = $0 if p(good weather) = .186 

■ EPPI 
* Expected profit with perfect information 
* Always bid if good weather and never bid if bad weather 
* [(.2)($50,000) + (.8)($0)] = $10,000 
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Value of Information 





■ EVPI 
* Expected value of perfect information: the expected value of a perfect weather forecast 
* EPPI minus no forecast EMV 
* [(.2)($50,000) + (.8)($0)] - $2,000 = $8,000 
* EVPI is the upper bound on the amount you would be willing 
to pay for a perfect weather forecast 

■ EVSI 
* Expected value of sample information 
* EMV with sample information - EMV with no information 
* If the cost of sample information has been subtracted from 
the payoff, then this amount must be added to the EMV with 
sample information 
* [$400 + $5,000] - $2,000 = $3,400 
* If the price of the weather forecast is negotiable, then this is the most 
you are willing to pay for sample information 
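
The figures above ($2,000, $10,000, $8,000, $400, and $3,400) can all be reproduced from the problem data. The Python sketch below does so with exact fractions. Variable names such as `emv_bid` and `gross_with_forecast` are invented for this example, and it assumes the contractor takes the better of Bid and No Bid after each forecast.

```python
from fractions import Fraction as F

P_GOOD = F(1, 5)                        # prior probability of good weather
PROFIT, LOSS = 50_000, -10_000          # payoff if bid: on time, late
FORECAST_COST = 5_000
P_PREDICT_GOOD = {"good": F(7, 10), "bad": F(2, 10)}  # P(predict good | weather)

def emv_bid(p_good):
    """EMV of bidding when good weather has probability p_good."""
    return p_good * PROFIT + (1 - p_good) * LOSS

# No forecast: choose the better of bidding and not bidding (payoff 0).
emv_no_forecast = max(emv_bid(P_GOOD), 0)

# Perfect information: bid only when the weather will be good.
eppi = P_GOOD * max(PROFIT, 0) + (1 - P_GOOD) * max(LOSS, 0)
evpi = eppi - emv_no_forecast

# Imperfect forecast: revise P(good) after each prediction, then choose.
gross_with_forecast = 0
for predicts_good in (True, False):
    like_good = P_PREDICT_GOOD["good"] if predicts_good else 1 - P_PREDICT_GOOD["good"]
    like_bad = P_PREDICT_GOOD["bad"] if predicts_good else 1 - P_PREDICT_GOOD["bad"]
    p_signal = like_good * P_GOOD + like_bad * (1 - P_GOOD)
    p_good_given_signal = like_good * P_GOOD / p_signal
    gross_with_forecast += p_signal * max(emv_bid(p_good_given_signal), 0)

emv_with_forecast = gross_with_forecast - FORECAST_COST  # forecast cost subtracted
evsi = gross_with_forecast - emv_no_forecast             # cost added back, per the slide

print(emv_no_forecast, eppi, evpi, emv_with_forecast, evsi)
```

Since EVSI ($3,400) is below the $5,000 asking price, buying the forecast lowers the overall EMV from $2,000 to $400.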


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| Value of Information 
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Value of Information and the Probability of Good Weather 


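[Chart: value of information plotted against the probability of good weather. Horizontal axis 
10% to 100%; vertical axis $10,000 to $40,000. Note on chart: forecast of good weather is .3 
and bad weather is .7.] 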
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Linear Programming 


■ A mathematical technique for optimizing the 
use of resources given: 
* an objective function that expresses relationships among resource 
alternatives (e.g., minimize costs or maximize contribution) 
* available resources are in limited supply 
* alternative solutions could achieve the objective 
* relationships must be expressed as linear equations or 
inequalities 
■ Commonly used for 
* mixture problems (Mrs. Fields Cookies, portfolio design, 
advertising in publications, manufacturing) 
* routing problems (warehousing logistics, transportation) 


Linear Programming I 
OPIM 101 









Linear Programming Example

Champion Sports manufactures two types of custom men's underwear: boxers and briefs.

Briefs use 0.5 yards of material; boxers use 0.4 yards. 300 yards of material are available.

Each boxer uses 1 insignia logo and 600 insignia logos are in stock.

It requires 1 hour to manufacture one pair of boxers and 2 hours for one pair of briefs. 900 labor hours are available.

There is unlimited demand for boxers but total demand for briefs is 375 units per week.

The contribution per boxer is $3.00 and the contribution per brief is $4.50.

What mix maximizes contribution?
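
With only two decision variables, the Champion Sports problem can be solved by checking every corner of the feasible region. The sketch below is one way to do that in plain Python; it is not the Excel Solver method used in these slides, and the function names are illustrative.

```python
from itertools import combinations

def champion_constraints(labor_hours=900.0):
    """Each row (a, b, rhs) means a*boxers + b*briefs <= rhs."""
    return [
        (0.4, 0.5, 300.0),        # yards of material
        (1.0, 0.0, 600.0),        # insignia logos (boxers only)
        (1.0, 2.0, labor_hours),  # labor hours
        (0.0, 1.0, 375.0),        # weekly demand for briefs
        (-1.0, 0.0, 0.0),         # boxers >= 0
        (0.0, -1.0, 0.0),         # briefs >= 0
    ]

def best_extreme_point(constraints, contribution=(3.00, 4.50), tol=1e-7):
    """Return (total contribution, boxers, briefs) at the best feasible corner."""
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel constraint lines never intersect
        # Cramer's rule for the intersection of the two constraint lines
        boxers = (r1 * b2 - r2 * b1) / det
        briefs = (a1 * r2 - a2 * r1) / det
        if all(a * boxers + b * briefs <= rhs + tol for a, b, rhs in constraints):
            total = contribution[0] * boxers + contribution[1] * briefs
            if best is None or total > best[0]:
                best = (total, boxers, briefs)
    return best

base = best_extreme_point(champion_constraints())
more_labor = best_extreme_point(champion_constraints(labor_hours=1000.0))
print("base case:", base)
print("labor + 100 hours:", more_labor)
```

The base case lands on the corner where the material and labor constraints intersect (about 500 boxers and 200 briefs). Corner enumeration only scales to tiny models; larger ones need the simplex method.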
What if labor increased by 100 hours?
  * The feasible region expands.
Linear Programming Issues

● Extreme points
  A corner of the feasible region formed by intersecting constraints. When an optimal solution exists, one is always found at an extreme point.

● Infeasibility
  There is no solution that satisfies each and every constraint. There is no feasible region.

● Unboundedness
  The model has not been correctly formulated, since the objective function can go to infinity without violating any constraint. Often a constraint is missing.

● Redundancy
  A constraint that does not affect the feasible region. Deleting redundant constraints enhances computational efficiency.
Excel Formulation

Excel Solver

Subject to the constraints:
  $B$9:$C$9 >= 0
  Demand Briefs <= $F$7
  Hours Labor <= $F$6
  Logos <= $F$5
  Yards Material <= $F$4
Excel Solver

■ Press the Ctrl key and use the mouse to select both the Answer and Sensitivity Reports
Excel Solver

■ The Solver could not find a feasible solution
  * This is usually an unboundedness or infeasibility problem.
■ The maximum iteration or time limit was reached: continue anyway?
  * The warning prevents infinite computer time for an unsolvable model. A Solver Options button allows you to increase these limits.
■ If the Solver solution does not look correct (e.g., 2 decision variables and 1 binding constraint)
  * Try increasing the tolerance to 1% under the Solver Options menu.


Excel Solver

■ Assume linear model option
  * Greatly speeds computation if model is linear
  * Generates error message if model is not linear
  * Must specify this option for right-hand-side and objective ranging
  * If not selected, Solver generates a Sensitivity Report using Reduced Gradient instead of Reduced Cost and Lagrange Multiplier instead of Shadow Price
Excel Solver Answer Report

Microsoft Excel 5.0 Answer Report
Worksheet: [94BOXERS.XLS]Boxers Base Case
Report Created: 8/24/94 9:22

Target Cell (Max) (optimum objective function value)
Cell    Name               Original Value   Final Value
$D$10   Margin Total       $2,400.00        $2,400.00

Adjustable Cells (optimum decision variable values)
Cell    Name               Original Value   Final Value
$B$9    Boxer Production   500              500
$C$9    Brief Production   200              200

Constraints
Cell    Name               Cell Value   Formula        Status        Slack
$D$4    Yards Material     300          $D$4<=$F$4     Binding       0
$D$5    Logos              500          $D$5<=$F$5     Not Binding   100
$D$6    Hours Labor        900          $D$6<=$F$6     Binding       0
$D$7    Demand Briefs      200          $D$7<=$F$7     Not Binding   175
$B$9    Boxer Production   500          $B$9>=0        Not Binding   500
$C$9    Brief Production   200          $C$9>=0        Not Binding   200

■ Status of constraints
  * binding constraints never have slack or surplus
  * not binding constraints always have slack or surplus
  * not satisfied (incorrect specification of the constraints)
  * number of binding constraints >= number of decision variables

■ Slack measures unused available resources
Interpreting LP Solutions

■ Shadow price - value of 1 additional unit of that resource
■ Right-hand-side ranging
  Upper and lower boundary range over which shadow prices are valid. Multiple RHS changes are possible.
■ Reduced cost - value of not using 1 unit of that resource
  (per unit increase of that non-basic variable)
■ Objective ranging

For each objective function coefficient, there is an upper and lower boundary range of values over which the optimal solution to the problem does not change.

As the value of an objective coefficient changes, the optimal objective function value, the shadow prices, and the reduced costs will change. However, the values of the optimal basic (used in solution) variables do not change.

Objective ranging provides a sensitivity analysis of how the solution changes as we move past the bounds of the original optimal solution. However, to obtain the exact solution, the model must be re-solved.
Excel Solver Sensitivity Report

Microsoft Excel 5.0 Sensitivity Report
Worksheet: [94BOXERS.XLS]Boxers Base Case
Report Created: 8/24/94 9:23

Changing Cells (reduced cost and objective ranging)
Cell    Name               Final Value   Reduced Cost   Objective Coefficient   Allowable Increase   Allowable Decrease
$B$9    Boxer Production   500           $0.00          3                       0.6                  0.75
$C$9    Brief Production   200           $0.00          4.5                     1.5                  0.75

Constraints (shadow prices and right-hand-side ranging)
Cell    Name               Final Value   Shadow Price   Constraint R.H. Side    Allowable Increase   Allowable Decrease
$D$4    Yards Material     300           $5.00          300                     15                   52.5
$D$5    Logos              500           $0.00          600                     1E+30                100
$D$6    Hours Labor        900           $1.00          900                     131.25               60
$D$7    Demand Briefs      200           $0.00          375                     1E+30                175

● How much would you pay for one additional insignia logo?
  Nothing, since logos are not constraining the solution!
Excel Solver Reports

● How much would you pay for one additional logo?

● A new market survey shows that demand will double from 375 to 750. How many additional briefs should be manufactured to meet this new demand?

● A stain is found on 15 yards of material, reducing material from 300 to 285 yards. How does this affect total contribution?

● Labor is willing to negotiate 100 additional hours of production work. How much should management be willing to pay?

● If labor offers 100 additional hours of production work at $1.50 per hour, should management be willing to accept the offer?

● Product designers are considering offering a new line of padded briefs that require 1 yard of material and 2 hours of labor. Contribution would be $6.00 per padded brief. Should management begin this new line?

● If material decreases from 300 to 290 yards and labor increases from 900 to 1000 hours, what is the change in weekly contribution?

● A management consultant offers to improve efficiency in the production of boxers. The improvements will increase the contribution by $0.50 to $3.50. What is the new mix? What is the increase in weekly contribution?

● The contribution for briefs decreases by $0.75 to $3.75. What is the new mix? What is the decrease in weekly contribution?
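
Several of the single-change questions above can be approached directly from the shadow prices and allowable ranges in the Sensitivity Report. The sketch below encodes that report as a dictionary; the helper names are hypothetical, and `float("inf")` stands in for Solver's 1E+30. It handles one right-hand-side change at a time and is not a substitute for re-solving the model.

```python
# Shadow price and allowable right-hand-side changes, copied from the report
SENSITIVITY = {
    "material": {"shadow_price": 5.00, "increase": 15.0, "decrease": 52.5},
    "logos":    {"shadow_price": 0.00, "increase": float("inf"), "decrease": 100.0},
    "labor":    {"shadow_price": 1.00, "increase": 131.25, "decrease": 60.0},
    "demand":   {"shadow_price": 0.00, "increase": float("inf"), "decrease": 175.0},
}

def contribution_change(resource, delta):
    """Change in weekly contribution for a single RHS change of `delta` units."""
    row = SENSITIVITY[resource]
    limit = row["increase"] if delta >= 0 else row["decrease"]
    if abs(delta) > limit:
        raise ValueError("change is outside the range where the shadow price is valid")
    return row["shadow_price"] * delta

def opportunity_cost(usage):
    """Value of the resources one unit of a proposed product would consume."""
    return sum(SENSITIVITY[name]["shadow_price"] * amount for name, amount in usage.items())

print(contribution_change("logos", 1))       # one more logo
print(contribution_change("material", -15))  # stained material
print(contribution_change("labor", 100))     # 100 extra labor hours
print(opportunity_cost({"material": 1, "labor": 2}))  # padded briefs vs. $6.00 contribution
```

On these numbers, 100 extra labor hours are worth $1.00 each, so an offer at $1.50 per hour costs more than it returns. Padded briefs consume $7.00 of resources for a $6.00 contribution.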
■ Graphic method is limited to two decision variables

■ More complex models use Simplex method
  * algebraic representation of extreme points (constraint corners)
  * algorithm proceeds from extreme point to adjacent extreme point (each move is called an iteration)
  * if unbounded, simplex method discovers this during execution
  * for a maximization problem, the objective value will generally increase with each iteration; for a minimization problem, it will generally decrease

■ Extremely large problems use Karmarkar LP
  * by Narendra Karmarkar of AT&T Bell Laboratories in 1984
  * Desert Storm logistics - 500,000 variables & 70,000 constraints


The Dual in Linear Programming

■ Duality examines the value of resources corresponding to the constraints in the original or primal model (e.g., What is the value of 1 additional hour of labor?)

■ There is one dual variable for each constraint.

■ The dual variable is zero for all non-binding constraints.

■ In Champion Sports, the dual LP minimizes use of hours labor, material, logos, and demand subject to the constraint that the value of the resources used is greater than or equal to its contribution margin
Excel Formulation of the Dual

Champion Sports (MINIMIZE)

            Material   Insignia Logos   Labor      Demand     Value of resources used      Contribution
Cost/unit   $5.00000   $0.00000         $1.00000   $0.00000
Boxers      0.40       1.00             1.00       0.00       $3.00                     >= $3.00
Briefs      0.50       0.00             2.00       1.00       $4.50                     >= $4.50

Note formula in F6 =SUMPRODUCT($B$4:$E$4,B6:E6)
■ Answer report showing binding constraints
  * There are 4 binding constraints
  * Material and Labor that were binding in the primal are not binding in the dual

Name   Cell Value   Formula    Status   Slack
       $0.00000     $E$4>=0
Excel Formulation of the Dual

Changing Cells
Cell    Name          Final Value   Reduced Cost   Objective Coefficient   Allowable Increase   Allowable Decrease
$B$4    Material      $5.00000      0.00           300                     15.00                52.50
$C$4    Insignia_Lo   $0.00000      100.00         600                     1E+30                100
$D$4    Labor         $1.00000      0.00           900                     131.25               60.00
$E$4    Demand        $0.00000      175.00         375                     1E+30                175

Constraints
Cell    Name                 Final Value   Shadow Price   Constraint R.H. Side   Allowable Increase   Allowable Decrease
$F$6    Boxer contribution   $3.00         500.00         $3.00                  $0.60                $0.75
$F$7    Brief contribution   $4.50         200.00         $4.50                  $1.50                $0.75

Sensitivity report showing shadow prices and reduced costs for the dual LP
Interpretations of Duality

■ Resource "rents" must be at least as much as the contribution from producing.
  ( $3.00 x Boxers ) + ( $4.50 x Briefs )
  MUST BE LESS THAN OR EQUAL TO
  ( 300 x Material ) + ( 600 x Logos ) + ( 900 x Hours ) + ( 375 x Demand )

■ Optimal value is always the same in both the primal and its dual ($2,400 in this example)

■ Simplex method solves the primal and dual simultaneously
  * Values of shadow prices in one optimal solution (primal or dual) are always equal to the value of the structural variables in the other optimal solution (primal or dual).
  * Thus, all information from the dual is available from the primal
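
The duality claims above can be checked numerically for Champion Sports. The sketch below uses the primal solution and shadow prices from the Solver reports in these slides; the dictionary layout is illustrative only.

```python
resources = {"material": 300, "logos": 600, "labor": 900, "demand": 375}
shadow_prices = {"material": 5.0, "logos": 0.0, "labor": 1.0, "demand": 0.0}

# Resource use per unit of each product (demand row: only briefs count)
usage = {
    "boxers": {"material": 0.4, "logos": 1, "labor": 1, "demand": 0},
    "briefs": {"material": 0.5, "logos": 0, "labor": 2, "demand": 1},
}
contribution = {"boxers": 3.00, "briefs": 4.50}
production = {"boxers": 500, "briefs": 200}

# Primal objective: contribution earned by the optimal mix
primal_value = sum(contribution[p] * production[p] for p in production)

# Dual objective: resources valued at their shadow prices ("rents")
dual_value = sum(shadow_prices[r] * resources[r] for r in resources)

# Dual constraints: value of resources used per unit >= contribution per unit
imputed_cost = {
    p: sum(shadow_prices[r] * usage[p][r] for r in resources) for p in usage
}

print(primal_value, dual_value, imputed_cost)
```

Both objectives come to $2,400. The imputed resource cost of each product equals its contribution, which is why both dual constraints are binding.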
Application: Red Brand Canners

Contract price for the 3 million pound tomato crop was $.06 per pound. 80% of the crop was grade B; 20% of the crop was grade A.

There was demand for 14.4 million pounds whole tomatoes, 1 million pounds juice tomatoes, and 2 million pounds paste tomatoes.

A quality scale is used to rate tomatoes on a 0 - 10 point scale (ten is best). Grade A averages 9 points per pound; grade B averages 5 points per pound. The minimum grade quality for whole tomatoes is 8 points, and the minimum grade quality for juice is 6 points.

The margin for whole, juice, and paste tomatoes was $1.48, $1.32, and $1.85 per case with 18, 20, and 25 pounds of tomatoes per case, respectively.

What mix will maximize contributions at Red Brand Canners?
Application: Red Brand Canners

■ Define the 6 decision variables:
  * Whole grade A
  * Whole grade B
  * Juice grade A
  * Juice grade B
  * Paste grade A
  * Paste grade B

■ What are the non-negativity constraints?
Application: Red Brand Canners

Let WA = whole grade A
    WB = whole grade B
    JA = juice grade A
    JB = juice grade B
    PA = paste grade A
    PB = paste grade B

What are the constraints for supply of tomatoes?

Contract price for the 3 million pound tomato crop was $.06 per pound. 80% of the crop was grade B; 20% of the crop was grade A.

What are the constraints for demand of tomatoes?

There was demand for 14.4 million pounds whole tomatoes, 1 million pounds juice tomatoes, and 2 million pounds paste tomatoes.
Application: Red Brand Canners

Let WA = whole grade A
    WB = whole grade B
    JA = juice grade A
    JB = juice grade B
    PA = paste grade A
    PB = paste grade B

What are the constraints for quality of tomatoes?

A quality scale is used to rate tomatoes on a 0 - 10 point scale (ten is best). Grade A averages 9 points per pound; grade B averages 5 points per pound. The minimum grade quality for whole tomatoes is 8 points, and the minimum grade quality for juice is 6 points.

What is the objective function?

The margin for whole, juice, and paste tomatoes was $1.48, $1.32, and $1.85 per case with 18, 20, and 25 pounds of tomatoes per case, respectively.
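
One possible reading of the Red Brand Canners questions is sketched below as a feasibility and margin checker. It makes several assumptions: a plan is a dictionary mapping (product, grade) to pounds, quality is a weighted average of grade points, paste has no minimum quality, and the candidate plan at the end is only an illustration, not a claimed optimum.

```python
SUPPLY = {"A": 600_000, "B": 2_400_000}       # 20% and 80% of 3 million pounds
DEMAND = {"whole": 14_400_000, "juice": 1_000_000, "paste": 2_000_000}
POINTS = {"A": 9, "B": 5}                     # quality points per pound
MIN_QUALITY = {"whole": 8, "juice": 6, "paste": 0}  # paste minimum assumed to be none
MARGIN_PER_LB = {"whole": 1.48 / 18, "juice": 1.32 / 20, "paste": 1.85 / 25}

def is_feasible(plan, tol=1e-6):
    """Check non-negativity, supply, demand, and quality constraints."""
    if any(pounds < -tol for pounds in plan.values()):
        return False
    for grade, available in SUPPLY.items():
        used = sum(lbs for (_, g), lbs in plan.items() if g == grade)
        if used > available + tol:
            return False
    for product in DEMAND:
        by_grade = {g: plan.get((product, g), 0.0) for g in POINTS}
        total = sum(by_grade.values())
        if total > DEMAND[product] + tol:
            return False
        # Average quality >= minimum, written without dividing by total
        points = sum(POINTS[g] * by_grade[g] for g in POINTS)
        if points < MIN_QUALITY[product] * total - tol:
            return False
    return True

def total_margin(plan):
    """Objective function: margin per pound times pounds, summed over the plan."""
    return sum(MARGIN_PER_LB[product] * lbs for (product, _), lbs in plan.items())

# Hypothetical candidate plan, for illustration only
candidate = {
    ("whole", "A"): 525_000, ("whole", "B"): 175_000,
    ("juice", "A"): 75_000,  ("juice", "B"): 225_000,
    ("paste", "A"): 0,       ("paste", "B"): 2_000_000,
}
print(is_feasible(candidate), round(total_margin(candidate), 2))
```

In this formulation the contract price of $.06 per pound does not enter the objective, because the whole crop is already purchased whatever the mix.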
Neural Networks

■ Analogy between neural nets and the nervous system
■ History of neural networks
■ How neural nets work
■ Example problem
■ Common questions about neural networks
■ Application examples
■ Selected references
■ Summary
Analogy between neural nets and the nervous system

■ Neural nets based on nodes and connections
  Analogous to a nerve cell - 10^12 neurons and 10% synaptic connections in the human brain

■ Nodes have input signals
  Dendrites carry an impulse to the neuron

■ Nodes have one output signal
  Axons carry signal out of neuron and synapses are local regions where signals are transmitted from the axon of one neuron to dendrites of another.

■ Input signal weights are summed at each node
  Nerve impulses are binary; they are "go" or "no go". Neurons sum up the incoming signal and fire if a threshold value is reached.
How Neural Nets Work

■ Implementation
  * Hardware - electronic circuits mimic neurons
  * Software - linkages of nodes, inputs, and outputs can be programmed

■ Uses a trial and error method of learning
  * Finds patterns associating inputs and outputs using a large set of training data where both inputs and outputs are known (e.g., use the intermarket relationship among the Standard & Poor's 500 index, 30-year Treasury bonds, and the Commodity Research Bureau index to predict direction of the S&P 500 index trend 5 weeks into the future)
  * Initially begins with random weights and corrects mistakes by modifying the weight that it has given each input item.


Neural Networks | Wharton 
How Neural Nets Work

■ Feedback network
  * A given node's output can be transmitted back to itself or to other previous nodes as another input

■ Feedforward network
  * All outputs only go forward

■ Parallel distributed processing versus serial symbolic processing

[Figure: Neural Network Layers - Input Layer, Hidden Layers, Output Layer]
How Neural Nets Work:

■ Tradeoff between training speed and weight quality
  * if too fast, weights may not be effective for new data
  * if too slow, network may "memorize" the data and not predict well for new data

■ Models and rules for learning are based in biology and psychology
  * Hebb's rule - changes in synaptic strengths are proportional to neuron activation (Hebb 1949). Basis for neural nets.
  * Grossberg learning - self-training and self-organization allow net to adapt to changes in input data over time (Grossberg 1982)
  * Kohonen's learning law - two-layer network with content addressable associative memory for unsupervised learning (Kohonen 1984)
How Neural Nets Work: Unsupervised Learning

■ Nets are self-learning
  * BAM (bi-directional associative memory) used for OCR, spelling checker, voice recognition
  * Weight adjustments are not from comparison with known values
  * Based on the input pattern, only weights for the winning node or a few nodes are modified

$W_{ij} = A_i A_j$ where:
  $A_i$ is the activation of the ith node in one layer
  $A_j$ is the activation of the jth node in another layer
  $W_{ij}$ is the connection strength between two nodes
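
The weight rule above amounts to an outer product of the two layers' activations. A minimal sketch, assuming activations are given as plain lists of numbers (the function name is illustrative):

```python
def hebbian_weights(layer_a, layer_b):
    """W[i][j] = A_i * A_j for node i in one layer and node j in another."""
    return [[a_i * a_j for a_j in layer_b] for a_i in layer_a]

weights = hebbian_weights([1, 0, 1], [1, 1])
for row in weights:
    print(row)
```

Connections between two active nodes get a weight of 1. Any connection touching an inactive node stays at 0.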
How Neural Nets Work: Supervised Learning

■ Gradually train weights to meet desired outputs
  * inputs presented to the network
  * weights adjusted to achieve desired output for training data
  * corrections based on difference between actual and desired output, which is computed for each training cycle
  * if average error is within tolerance, stop; else continue training
  * weights are locked in and the network is ready to use

$\Delta W_{ij} = \alpha A_i (C_j - B_j)$ where $\alpha$ is the learning rate,
  $A_i$ is the activation of the ith node in one layer,
  $B_j$ is the actual activation of the jth node in the recalled pattern,
  $C_j$ is the desired activation of the jth node, and
  $W_{ij}$ is the connection strength between two nodes
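
The supervised update rule can be tried on a single output node learning the logical AND of two inputs. The sketch below departs from the slides in a few labeled ways: weights start at zero rather than random values so the run is reproducible, a constant first input of 1 acts as a bias, and training stops when a full cycle has no errors.

```python
def threshold(x):
    """Node fires (1) only when the summed input is positive."""
    return 1 if x > 0 else 0

def train(cases, learning_rate=0.5, max_cycles=100):
    """Apply delta W_i = rate * A_i * (desired - actual) until a cycle has no errors."""
    weights = [0.0] * len(cases[0][0])   # assumption: zero start instead of random
    for _ in range(max_cycles):
        errors = 0
        for activations, desired in cases:
            actual = threshold(sum(a * w for a, w in zip(activations, weights)))
            if actual != desired:
                errors += 1
                for i, a in enumerate(activations):
                    weights[i] += learning_rate * a * (desired - actual)
        if errors == 0:
            break   # error within tolerance: lock in the weights
    return weights

# Training data: (bias, input 1, input 2) -> desired output (logical AND)
AND_CASES = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]

weights = train(AND_CASES)
predictions = [
    threshold(sum(a * w for a, w in zip(activations, weights)))
    for activations, _ in AND_CASES
]
print(weights, predictions)
```

A single node like this can only learn patterns that a straight line can separate. Anything else is the motivation for hidden layers.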
Mathematical Model of a Node

Summation function: $x = \sum_i a_i W_i$ (incoming activations $a_i$, each weighted by $W_i$)

Threshold function:
  $f(x) = 1$ if $x > 0$
  $f(x) = 0$ if $x \le 0$
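
As a small illustration of the node model, the sketch below sums weighted incoming activations and applies a threshold at zero. The function name and sample numbers are illustrative, and treating exactly zero as "no fire" is an assumption.

```python
def node_output(activations, weights):
    """Weighted sum of incoming activations followed by a 0/1 threshold."""
    x = sum(a * w for a, w in zip(activations, weights))
    return 1 if x > 0 else 0

print(node_output([1, 1], [0.6, 0.6]))   # summed input 1.2
print(node_output([1, 0], [-0.5, 2.0]))  # summed input -0.5
```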
How Neural Nets Work: Back Propagation

(1) Input is presented to net and output is produced
(2) Compute differences between actual and desired outputs
(3) Adjust output layer weights using discrepancies between desired outputs and actual outputs
(4) Then adjust hidden layer weights (if there is a hidden layer)
(5) Then adjust input layer weights
(6) Repeat steps 1 - 5 until desired accuracy level is achieved

■ Advantage:
  * ability to learn any arbitrarily complex nonlinear mapping
■ Disadvantages:
  * extremely long - potentially infinite - learning times
  * speed up using parallel hardware








[Figure: Neural Network Layers: Back Propagation]
Common Questions About Neural Networks

■ What is a hidden layer?
  * Group of nodes between the input and output layer
  * Hidden layers increase the ability of the network to "memorize" the data

■ How many hidden layers should I use?
  * As problem complexity increases, number of hidden layers should also increase
  * Start with none. Add hidden layers one at a time if training or testing results do not achieve target accuracy levels

■ What is a hidden node?
  * A node in a hidden layer is called a hidden node
  * Hidden nodes contain much of the knowledge in the network and act as filters to remove noise moving through the network
Common Questions About Neural Networks

■ How many nodes and hidden nodes should I use?
  * Ideally, you will have between 2 and 3 training cases per connection (synapse). Fewer training cases per connection will cause problems generalizing to the test data.
  * For example, suppose you have 60 inputs, 240 training cases and one hidden layer with 5 hidden nodes. A fully connected network would have 60 x 5 + 5 connections (305). This is less than one training case per connection (240/305).
  * Correct by decreasing inputs to between 15 and 23 (determine which 15 or 23 by trial and error)
    - 15 x 5 + 5 connections (80) for a 3:1 ratio (240/80)
    - 23 x 5 + 5 connections (120) for a 2:1 ratio (240/120)
  * Or correct by reducing the number of hidden nodes to 2. A fully connected network would have 60 x 2 + 2 connections (122). This is almost 2 training cases per connection (240/122).
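
The connection counts in this example follow one formula: inputs times hidden nodes, plus one connection from each hidden node to the output. A minimal sketch, assuming a single hidden layer and a single output node as in the example (function names are illustrative):

```python
def connections(inputs, hidden_nodes, outputs=1):
    """Connections in a fully connected net with one hidden layer."""
    return inputs * hidden_nodes + hidden_nodes * outputs

def cases_per_connection(training_cases, inputs, hidden_nodes, outputs=1):
    """Rule of thumb from the slide: aim for roughly 2 to 3."""
    return training_cases / connections(inputs, hidden_nodes, outputs)

for inputs, hidden in [(60, 5), (15, 5), (23, 5), (60, 2)]:
    ratio = cases_per_connection(240, inputs, hidden)
    print(f"{inputs} inputs, {hidden} hidden: "
          f"{connections(inputs, hidden)} connections, {ratio:.2f} cases each")
```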
Common Questions About Neural Networks

■ How do I know if network modifications are needed?
  * Low accuracy of training or test data indicates that a new hidden layer or more hidden nodes are needed
    - if number of hidden nodes exceeds number of inputs and outputs, then add another hidden layer
    - decrease the total hidden nodes by 50% in each successive hidden layer [if 10 nodes in first layer, then use 5 in the second layer and 2 in the third layer]
  * If Braincel performs well on the Training and Test ranges, but poorly on new records, then it is treating each record as a special case and has "memorized" the data
    - use fewer hidden nodes or remove the hidden layer
  * Could also need more training cases per connection


Application Examples: Finance and Banking

■ Firm failure prediction (Koster, Sondak, & Bourbia 1991; Wilson & Sharda 1993)
■ Bank failure prediction (Cinar & Lash 1992; Tam & King 1992)
■ Bond rating (Utans & Moody 1991)
■ Mortgage credit approval (Reilly et al. 1990)
■ Credit card fraud prevention at Chase Manhattan Bank, American Express, and Mellon Bank: examine unusual credit-charge patterns over a history of usage and compute a fraud potential rating. [For example, the Fraud Detection System by Nestor Corp. and a system by HNC Inc. (Rochester 1990)]
■ Takeover target prediction (Sen & Gibbs 1992)
Application Examples: Finance and Banking

■ Country risk rating for early warning of financial risk (Roy and Cosset 1990)
■ Stock price prediction (Fishman, Barr, & Loick 1991; Yoon & Stein 1991)
■ Commodity, futures, and currency trading at Merrill Lynch, Salomon Brothers, Shearson Lehman Brothers, & the World Bank. Citibank claims 25% returns in currency trading using GA trained neural nets (Business Week March 2, 1992)
■ Asset allocation (Steiger & Sharda 1991)
■ Corporate merger prediction (Sen, Oliver, & Sen 1992)
Application Examples: Manufacturing

■ Quality control
■ Predict tool breakage in milling operations
■ Force and / or wear analysis
■ Mechanical equipment fault diagnosis
■ Process management and control - maintain efficiency of electric arc furnaces in steelmaking; uniformity in pulp & paper process management
Application Examples: Marketing

■ Customer mailing list management (Hall 1992)
■ Spiegel Inc. mail order catalog targets saved $1 million from reduced costs and increased sales (Business Week March 2, 1992)
■ Airline seating allocation and passenger demand for Nationair Canada and US Air (IEEE Expert Dec 1992)
■ Customer purchasing behavior and merchandising-mix strategies
■ Hotel room pricing - yield management (Relihan, W. 1989)
Genetic Algorithms

■ What are genetic algorithms?
■ Genetic algorithm components
■ Example problem
■ Common questions about genetic algorithms
■ Example applications
■ Important references for genetic algorithms
■ Ten summary points
■ If we were to look at every alternative, what would we have to do? Of course, it depends.

■ Think: enzymes
  * Catalyze all reactions in the cell
  * Biological enzymes are composed of amino acids
  * There are 20 naturally-occurring amino acids
  * Easily, enzymes are 1000 amino acids long
  * 20^1000 = (2^1000)(10^1000) ≈ 10^1300

■ A reference number, a benchmark:
  10^80 ≈ number of atomic particles in universe
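
The size of this search space is easiest to check with logarithms, since the number itself is far too large to enumerate. A minimal sketch (the function name is illustrative):

```python
import math

def log10_sequences(alphabet_size, length):
    """Base-10 exponent of alphabet_size ** length."""
    return length * math.log10(alphabet_size)

enzyme_exponent = log10_sequences(20, 1000)
print(f"20^1000 is about 10^{enzyme_exponent:.0f}")
print(f"that exceeds 10^80 by a factor of about 10^{enzyme_exponent - 80:.0f}")
```

Exhaustive search is therefore impossible for a problem of this size, which is the case for heuristic methods such as genetic algorithms.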
Heuristic Search

■ Build rule-based expert systems
  * Performance so far not super impressive (somewhat impressive)
  * Doesn't show what's needed. Only shows that there exist such rules, not how they are found or how cognition could work. (rule-governed vs rule-described)
  * Expensive and very time-consuming in general

■ Build programs that acquire rules automatically
  * Genetic algorithms
  * Performance so far is very impressive (e.g., suspect ID)
  * Still time-consuming, but can hope for a general architecture
Genetic Algorithm Components

■ Selection
  * determines how many and which individuals breed
  * premature convergence sacrifices solution quality for speed

■ Crossover
  * select a random crossover point
  * successfully exchange substructures
  * 00000 x 11111 at point 2 yields 00111 and 11000

■ Mutation
  * random changes in the genetic material (bit pattern)
  * for problems with billions of local optima, mutations help find the global optimum solution

■ Evaluator function
  * rank fitness of each individual in the population
  * simple function (maximum) or complex function
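
Crossover and mutation on bit strings take only a few lines. The sketch below represents individuals as Python strings of "0" and "1"; the function names and the `rng` parameter are illustrative choices, not part of the slides.

```python
import random

def crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length bit strings after `point` bits."""
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def mutate(bits, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in bits
    )

print(crossover("00000", "11111", 2))
print(mutate("00100010011010", 0.007))
```

The first call reproduces the slide's example: 00000 x 11111 at point 2 yields 00111 and 11000.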
Genetic Algorithms: Example Problem

Annual sales for Avoiding Extinction by JWI Publishers is 20,000 copies. Books are sold for $30.

JWI Publishers have a variable cost of $6 per book associated with producing the book.

JWI Publishers have two fixed cost components. Overhead, royalties and other costs total $350,000. Setup cost per printing is $6,000. Thus, 4 quarterly printings would cost 4 x $6,000 = $24,000.







EOQ model yields

$N_{unit} = \sqrt{\frac{2PU}{C}} = \sqrt{\frac{2 \times \$6{,}000 \times 20{,}000}{\$6}}$

$N_{unit} \approx 6{,}325$ books/order
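
The EOQ figure can be reproduced directly. In the sketch below the parameter names are illustrative: the $6,000 setup cost, 20,000 annual copies, and $6 per-book figure are the values substituted into the formula on the slide.

```python
import math

def economic_order_quantity(setup_cost, annual_units, cost_per_unit):
    """EOQ: square root of (2 * setup cost * annual usage / per-unit cost)."""
    return math.sqrt(2 * setup_cost * annual_units / cost_per_unit)

order_size = economic_order_quantity(6_000, 20_000, 6)
setups_per_year = 20_000 / order_size
print(f"{order_size:,.0f} books/order, {setups_per_year:.2f} setups per year")
```

This gives the genetic algorithm example a known benchmark against which to compare the order size it converges to.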
Genetic Algorithms: Example Problem

Annual Book Sales
Number of Setups       3.16
Setup Cost             $6,000
Selling Price          $30
Variable Book Costs

Variable Costs         $18,978
Fixed Costs            $18,969
Other Costs            $350,000
Genetic Algorithms: Example Problem

(1) Choose the problem representation
  * 14-digit binary string allows an order size from 1 to 16,384 (2^14)
(2) Initialize the population
  * randomly generate 100 - 200 individual strings of length 14
(3) Calculate fitness for each individual
  * convert string to decimal and determine profit with that order size
  * 00100010011010 = 2,202

Total Revenue    $600,000
Variable costs
Setup costs      20,000 / 2,202
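
The fitness step can be sketched as "decode the string, then compute profit for that order size". The profit model below is an assumption inferred from the EOQ slide and the spreadsheet figures: revenue, minus a carrying cost of half the order size times $6, minus setup costs, minus the $350,000 of other costs. It may not match the course spreadsheet exactly.

```python
def decode(bits):
    """Convert a binary string to its decimal order size."""
    return int(bits, 2)

def profit(order_size, annual_sales=20_000, price=30.0,
           unit_cost=6.0, setup_cost=6_000.0, other_costs=350_000.0):
    """Assumed fitness: revenue less carrying, setup, and other costs."""
    if order_size <= 0:
        return float("-inf")   # an all-zero string is not a usable order size
    revenue = annual_sales * price
    carrying = order_size / 2 * unit_cost
    setups = annual_sales / order_size * setup_cost
    return revenue - carrying - setups - other_costs

for bits in ("00100010011010", "11011001000111"):
    size = decode(bits)
    print(bits, size, round(profit(size)))
```

Under this model, order sizes near the EOQ of about 6,325 score higher than either 2,202 or 13,895. Selection should therefore push the population toward that region.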
Genetic Algorithms: Example Problem

(4) Perform selection
  * long run: survival of the fittest
  * short run: merely nudges population towards better performers
  * replace the worst strings (bottom 5%) with copies of the best strings (top 5%); thus it would take a minimum of 20 generations before all strings are replaced - slow convergence.
(5) Perform crossover
  * randomly select two parents from the new population
  * randomly determine whether to crossover (p = .6)
  * if crossover, randomly select a crossover point (1-13)
  * example:
    00100010011010 (2,202) x 11011001000111 (13,895)
    at 3 yields
    11000010011010 (12,442) x 00111001000111 (3,655)
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Genetic Algorithms: 
Example Proble 





© Perform mutation 
* bit by bit, string by string, randomly determine whether to mutate 
each bit using a very low probability (p=.007). If mutation rate is 
too high, it will prevent convergence. 
* if mutation should occur, change 0 to 1 or vice versa. 
© Check convergence 
* bias is one measure of agreement among the population 
* bias assumes values between 50 and 100 percent 
* bit bias 
— if 100 strings have 0 in position 1 and 100 have a 1, then 
the bit bias is 50% 
— a 75:25 split or a 25:75 split has a 75% bias
— a 90:10 split or a 10:90 split has a 90% bias
* string bias is the average bias for each bit over all strings
* a population with an average bias of 95% has converged
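
The bias measure can be sketched as follows. The function names and the tiny sample populations are hypothetical; the 95% threshold is the one quoted above.

```python
def bit_bias(population: list[str], position: int) -> float:
    """Percent of strings that agree with the majority value at one bit position."""
    ones = sum(1 for s in population if s[position] == "1")
    share = ones / len(population)
    return 100 * max(share, 1 - share)


def average_bias(population: list[str]) -> float:
    """Average of the bit biases over every position in the string."""
    length = len(population[0])
    return sum(bit_bias(population, i) for i in range(length)) / length


def has_converged(population: list[str], threshold: float = 95.0) -> bool:
    return average_bias(population) >= threshold


even_split = ["0"] * 100 + ["1"] * 100      # 100 strings with 0, 100 with 1
three_to_one = ["0"] * 75 + ["1"] * 25      # 75:25 split
print(bit_bias(even_split, 0))              # 50.0
print(bit_bias(three_to_one, 0))            # 75.0
```

Because the majority share can never drop below one half, the measure stays between 50 and 100 percent, as the slide states.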
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Common Questions about 
Genetic Algorithms


m Can a GA converge to a poor solution? 


YES! Poor problem representation, premature convergence, a poor 
fitness evaluation algorithm, or luck of the random numbers could 
generate a poor solution 


m How do you know whether the GA solution is 
optimal or near optimal? 


If you knew how to find the optimal solution, you would not need to 
use a GA. There is no guarantee that a GA will find an optimal 
solution. GAs find a good solution that is "better" than others. 


m Are neural networks better than GAs? 


Neural networks require less structural knowledge. However, the 
type and number of node connections and hidden layers make it 
difficult to interpret relationships in a neural network. 

GAs require a starting framework to set up the problem
representation and calculate fitness.
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| Important References for 


Genetic Algorithms
m Holland, J.H. 1975. Adaptation in natural and
artificial systems. Ann Arbor, MI: The University
of Michigan Press.
* classic technical book with lots of theorems
m Goldberg, D.E. 1989. Genetic algorithms in search,
optimization, and machine learning.
Reading, MA: Addison-Wesley.
* graduate textbook for a machine learning course - code in Pascal
m Davis, L. (ed) 1991. Handbook of genetic 
algorithms. New York: Van Nostrand Reinhold. 
* tutorial and case applications with code in C or Lisp 
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Important References for 
Genetic Algorithms





. магад of programs as ui rather than 1s and 0s 


m Bauer, R.J. Jr. 1994. Genetic algorithms and 
investment strategies. New York: Wiley. 


* examples of GAs used for trading bonds and stocks 


m Karjalainen, R. and Allen, F. 1994. Using genetic
algorithms to find technical trading rules.
Wharton working paper
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| Genetic Algorithms: Summary 


1 Field is not new. Holland's work began in the 1970s.
2 Most of the work has been done in computer
science & engineering - not business applications

3 Translate problem into a string representation -
often binary numbers (11000)

4 Difficult to perform translation for some problems

5 Little knowledge at startup - randomly generated
population of individuals





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Genetic Algorithms: Summary 


6 Must be able to calculate fitness of each
individual in the population

7 Potential solutions with greater fitness have
greater priority in subsequent generations

8 Crossover is similar to mating and mutation
transforms a stable population to maintain
diversity of the search process

9 Steps 6-8 repeat. Eventually the population
converges and the fittest solution survives

10 GAs are not an optimization technique but often
find good solutions for large complex problems






Genetic Algorithms
OPIM 101
The Wharton School of the University of Pennsylvania







Functions of a Database System 








m Store and organize data efficiently

m Create and maintain the database

m Provide information to users

m Provide tools to meet changing information
requirements

m Protect the database against inconsistency
and errors

m Safeguard the database against unauthorized
access
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| Data Representation 





m Bit               binary digit 1 or 0
m Byte              1 alpha-numeric character
m Field or column   customer no., name, address, order number
m Record or row     all customer fields in 1 row
m File or data set  all customers
m Data base         customers, sales staff, inventory

m File processing

m Database management
Hierarchical
m Relationships defined at database creation
m Advantages
* two times faster than relational 


* simple to understand and model 


m Disadvantages 
* can not pose ad hoc queries 
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Network | 





m Subordinate records linked to multiple parents 
m 1:1, 1:M, and M:N relationships 
m Complex and difficult to apply 


[Figure: network of Courses records linked to student records (Student ID, Name, Grade)]
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| Il Relational 
m Properties
* each row is a record
* each column is a field
[Figure: example flight table with columns Flight, From, To, ...]
m Advantages
* best for ad hoc queries


* can represent any 
hierarchical or network 
database 


ш Disadvantage 
* Slow 
m Objects are specific entities
m Classes are program defined groups of objects

m Inheritance of all or some attributes or
operations (e.g. full-time vs part-time employee)
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| Database Systems 
m Data integrity - accuracy, correctness, & validity of data
* Safeguard data against invalid alteration or destruction


* Concurrent user access with record locking prevents 
one user from accessing a record while another is 
updating the same record. 


• Avoid redundant data 


m Data independence 


* physical - modify physical scheme without need to 
rewrite application programs 


* logical - modify conceptual scheme without causing 
application programs to be rewritten 
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Relational Database Terminology 





| m Relation: a table of tuples and attributes 

m Tuple: corresponds to a row of a table 

m Attribute: corresponds to a column or field 

m Cardinality: the number of tuples in a table 

= Degree: the number of attributes in a table 

ш Primary Key: unique identifier for every 
record in the table - no null values! 

m Candidate Key: any key that can serve as a 
primary key 
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Relational Database Terminology 








m Composite Primary Key: combination of two or more
attributes that together make a unique identifier

m Foreign Key: attribute in one table that is a 
primary key in another table 

m Nonkey attribute: any attribute that is not 
part of the primary key 
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) Am Cii 






[Table: unnormalized registrar rows; recoverable fragments: Banking, Inman, 5223, A, B; FIN 207 Sec Anal]





m Course number is a repeating group! 




Normalization


ш Basically normalization eliminates repeating 
groups to avoid redundancy in all tables 





ш Principal normal forms: 

* 1NF -all primary keys are defined, all nonkey attributes 
depend on the primary key, and no repeating groups 

• 2NF - no partial dependencies (only possible for tables 
with a composite primary key) 

* 3NF - no transitive dependencies 

* BC (Boyce/Codd) - equivalent to 3NF unless the table has
multiple candidate keys that are composite and have
at least one attribute in common.
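
As a rough illustration of what moving from 1NF to 3NF does, the sketch below splits one flat table into three and then joins them back. The rows are illustrative only: they reuse names that appear on the registrar slides, but which instructor teaches which course, and each office location, are assumptions made for the example.

```python
# Flat (1NF-style) rows: student #, course #, course title, instructor, office.
# The pairings below are assumed for illustration.
flat = [
    (69173, "MGT 104", "HRM", "Jones", "L3091"),
    (69173, "FIN 204", "Banking", "Inman", "5223"),
    (42356, "FIN 204", "Banking", "Inman", "5223"),
    (42356, "FIN 207", "Sec Anal", "Herring", "B320"),
]

# 2NF: title and instructor depend on course # alone, not on the whole
# composite key (student #, course #), so they move to a Course table.
enrollment = sorted({(student, course_no) for student, course_no, *_ in flat})
course = {course_no: (title, instr) for _, course_no, title, instr, _ in flat}

# 3NF: the office depends on the instructor, not directly on the course
# (a transitive dependency), so it moves to an Instructor table.
instructor = {instr: office for *_, instr, office in flat}

# Joining the three tables reproduces every original row.
rebuilt = [
    (s, c, course[c][0], course[c][1], instructor[course[c][1]])
    for s, c in enrollment
]
print(sorted(rebuilt) == sorted(flat))  # True
```

In the split form, the title "Banking" and Inman's office are each stored once, so an update touches a single row.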
The Registrar Database - 1NF
[Table: registrar data in 1NF; columns appear to be Student #, Course #, Course Title, Instructor Name, Instructor Location; recoverable entries include Sys An, MGT 104 HRM, Corp Fin, FIN 204 Banking, and FIN 207 Sec Anal (Herring)]

Student table:
Student #   Name     Major
            Bright
69173       Smith    MGT
42356       Wu       FIN

m No repeating groups! 1NF
m Student is 3NF
The Registrar Database - 1NF


m Insertion Anomaly
* Insert OPIM 101, Introduction to Computers with a new 
course number and new course title 


* One student must register for OPIM 101 to be able to insert! 
* Same problem with adding an instructor! 
= Update Anomaly 
* Change title from Sys Anal to Sys Anal & Des 
* User must search through all tuples and update course 
each time it occurs! 
m Deletion Anomaly 


* If one student is enrolled in an independent study and 
drops the course, we also lose information about the title 
and the instructor! 
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Banking 
Sec Anal 









Student #   Name     Major
38          Bright
69173       Smith
42356       Wu       FIN
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gape The Registrar Database - 2NF 


m Insertion Anomaly
*Insert new instructor
*Cannot insert instructor without assigning her a course!


m Update Anomaly 
*Each instructor can teach multiple courses 


*User must search through all tuples and update instructor 
data each time it occurs! 


m Deletion Anomaly 


*if one course is deleted, we also lose information about the 
instructor! 
The Registrar Database - 3NF
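[Tables: 3NF decomposition of the registrar data. Recoverable fragments: an Instructor table (Instructor Name, Instructor Location) with rows Jones L3091 and Staf B320; course MGT 104; course titles Banking and Sec Anal; student numbers 69173 and 42356 (Wu)]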
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|| _The Registrar Database - 3NF 


m 2NF anomalies concerning insertion, deletion,
and updating are removed.

* There may still be anomalies when a relation has multiple
candidate keys
* Boyce-Codd normal form
m No information is lost during the 
normalization process 


m Redundant information is reduced 
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The Registrar Database - 
Boyce-Codd normal form 
3NF
Student   Major     Advisor
123       Physics   Bohr
123       Music     Mozart
456       Biology   Darwin
          Physics   Bohr
          Physics   Einstein
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The Registrar Database - 
Boyce-Codd normal form 


[Tables: the relation split into Student-Advisor and Advisor-Major. Recoverable Advisor-Major rows: Bohr Physics, Darwin Biology, Einstein Physics]


m Equivalent to 3NF if the primary key is not a 
composite 
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The Parts-Supplier Database 
Suppliers 





Why two suppliers with SNAME of Smith? 

What fields are candidate keys for this relation? 
What does STATUS mean? How would you find out? 
Ordering on the rows? 
P#   PNAME   COLOR   WEIGHT   CITY
P4   Screw   Red     14       London
P6   Cog     Red     19       London
P2   Bolt    Green   17       Paris
P5   Cam     Blue    12       Paris
P3   Screw   Blue    17       Rome





* Does CITY in P mean the same as CITY in S? 
— Suppliers are located in s.city 
— Parts are stored in p.city 
— If P3 was not stored in Rome, does that mean there is not 
a supplier in Rome? 
* 17 what? Pounds? Ounces? Tons? Kilograms? 
* How would you find all parts supplied in Paris? 







* Double key, S#-P# 
* Why more than one table in the Parts-Supplier database? 
* How do we pose queries that rely on data in
more than one table?
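
One way to see how a query draws on more than one table is a join on the shared key fields. The sketch below uses Python's built-in `sqlite3` module rather than Access. The P rows come from the parts table above; the S and SP rows, and the column names `sno`, `pno`, and `qty`, are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S  (sno TEXT PRIMARY KEY, sname TEXT, city TEXT);
    CREATE TABLE P  (pno TEXT PRIMARY KEY, pname TEXT, color TEXT,
                     weight INTEGER, city TEXT);
    CREATE TABLE SP (sno TEXT, pno TEXT, qty INTEGER,
                     PRIMARY KEY (sno, pno));  -- the "double key" S#-P#
""")
# Supplier and shipment rows are hypothetical.
conn.executemany("INSERT INTO S VALUES (?, ?, ?)",
                 [("S1", "Smith", "London"), ("S2", "Jones", "Paris")])
conn.executemany("INSERT INTO P VALUES (?, ?, ?, ?, ?)", [
    ("P4", "Screw", "Red", 14, "London"),
    ("P6", "Cog", "Red", 19, "London"),
    ("P2", "Bolt", "Green", 17, "Paris"),
    ("P5", "Cam", "Blue", 12, "Paris"),
    ("P3", "Screw", "Blue", 17, "Rome"),
])
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)",
                 [("S1", "P2", 300), ("S1", "P4", 200), ("S2", "P5", 100)])

# Which suppliers ship parts that are stored in Paris?
rows = conn.execute("""
    SELECT DISTINCT S.sname, P.pname
    FROM S
    JOIN SP ON SP.sno = S.sno
    JOIN P  ON P.pno  = SP.pno
    WHERE P.city = 'Paris'
    ORDER BY S.sname, P.pname
""").fetchall()
print(rows)  # [('Jones', 'Cam'), ('Smith', 'Bolt')]
```

The SP table carries only the two key fields plus a quantity; the join conditions are what tie a shipment back to its supplier and its part.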
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Structured Query Language - SQL 


m General form of SELECT statement in Access: 


SELECT DISTINCTROW «attributes to be displayed»
FROM «tables»

WHERE «conditions are met»

ORDER BY «attribute» [DESC]


* Use of square brackets, 
e.g., P.[P#] for the P# field of table P 


* DISTINCTROW avoids duplicate records 
* ALL means duplicate records are not removed 
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Structured Query Language 
Example I 


SELECT DISTINCTROW P.COLOR, P.CITY
FROM P

WHERE P.CITY <> "Paris"

AND P.WEIGHT > 10;


m What does this say?

m Note: ORDER BY is optional and here absent

m Returns a table listing the color and city of the
part for all parts in the relation P where the city
is not equal to "Paris" and the part weight is
greater than 10
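
To check the reading of Example I, the same selection can be run with Python's built-in `sqlite3` module against the five P rows shown earlier. Two substitutions are assumptions of this sketch: SQLite has no DISTINCTROW keyword, so plain DISTINCT stands in for it, and the key column is named `pno` instead of P#.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE P (pno TEXT PRIMARY KEY, pname TEXT,
                                color TEXT, weight INTEGER, city TEXT)""")
conn.executemany("INSERT INTO P VALUES (?, ?, ?, ?, ?)", [
    ("P4", "Screw", "Red", 14, "London"),
    ("P6", "Cog", "Red", 19, "London"),
    ("P2", "Bolt", "Green", 17, "Paris"),
    ("P5", "Cam", "Blue", 12, "Paris"),
    ("P3", "Screw", "Blue", 17, "Rome"),
])

# Color and city of every part not in Paris that weighs more than 10.
result = conn.execute("""
    SELECT DISTINCT P.color, P.city
    FROM P
    WHERE P.city <> 'Paris'
      AND P.weight > 10
    ORDER BY P.color
""").fetchall()
print(result)  # [('Blue', 'Rome'), ('Red', 'London')]
```

P4 and P6 both produce the pair (Red, London), so removing duplicates leaves two rows from three matching parts.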







| Strategic Applications 


m Quality control at Whirlpool Corporation
m Commercial marine paint sales
m American Airlines SABRE reservation system
m Rosenbluth Travel
m American Hospital Supply
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The Information Retrieval Problem 


m The IR problem is very hard 


m Why? Many reasons, including: 
* Documents are not (very) structured 
— Database searches vs document base searches 
* Language is not (very) cooperative 
— DNA: microbiology or DEC Network Architecture? 
— Free rider: game theory or urban transportation systems? 
— Corporate memory or organizational memory? 
m Physical access vs logical access 
* Physical: relatively easy 
* Logical: terribly difficult 
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The Information Retrieval Problem: 
Basic IR Technology 


ш Your basic IR technology 
* Full text or keyword retrieval, with 
• Boolean combinations and 
• Location indicators 

m Full text--has everything 
* Or does it? 

m Keyword indexing 
* Requires work 

m Boolean combination of words 
* Usual Boolean operators: AND, OR, NOT 
* This is a logically complete set 
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The Information Retrieval Problem: 
Probability of Retrieving a Relevant Document 
















P(word1) = .6       probability searcher uses word1 in a query
P(word2) = .5       probability searcher uses word2 in a query
P(Doc|word1) = .7   probability word1 is in relevant document
P(Doc|word2) = .6   probability word2 is in relevant document


The probability of searcher using word1 in a query and word1 being
in a relevant document is P(word1) x P(Doc|word1) = .6 x .7 = .42


The probability of searcher using word2 in a query and word2 being
in a relevant document is P(word2) x P(Doc|word2) = .5 x .6 = .30


The probability of searcher using word1 and word2 in a query and
both word1 and word2 being in a relevant document is P(word1) x
P(Doc|word1) x P(word2) x P(Doc|word2) = .6 x .7 x .5 x .6 = .126
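
The arithmetic above can be reproduced directly. The dictionary names are hypothetical, and multiplying the four probabilities treats the events as independent, which is the assumption the slide's calculation makes implicitly.

```python
# Probability the searcher uses each word in a query.
p_in_query = {"word1": 0.6, "word2": 0.5}
# Probability each word appears in a relevant document.
p_in_relevant_doc = {"word1": 0.7, "word2": 0.6}

# Chance that a single word is both used and present.
single = {w: p_in_query[w] * p_in_relevant_doc[w] for w in p_in_query}

# Chance that both words are used and both are present (independence assumed).
both = single["word1"] * single["word2"]

print(round(single["word1"], 3), round(single["word2"], 3), round(both, 3))
# 0.42 0.3 0.126
```

Requiring a second word with AND cuts the chance of retrieving the relevant document from .42 to about .13, which is why adding terms to tighten a search tends to lower recall.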
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The Information Retrieval Problem: 
Basic IR Technology 


                 Relevant    Not Relevant
Retrieved           x             u          Total Number Retrieved = n2
Not Retrieved       v             y

Total Relevant = n1


Recall measures how well all relevant documents
are retrieved ( x / n1 )

Precision measures how well only relevant documents
are retrieved ( x / n2 )
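
A small sketch of the two measures, using the cell labels x, u, and v from the table. The counts in the example call are invented purely to show the calculation.

```python
def recall(x: int, v: int) -> float:
    """x relevant documents retrieved out of n1 = x + v relevant documents."""
    return x / (x + v)


def precision(x: int, u: int) -> float:
    """x relevant documents retrieved out of n2 = x + u retrieved documents."""
    return x / (x + u)


# Hypothetical search: 20 relevant retrieved, 5 irrelevant retrieved,
# 80 relevant documents missed.
x, u, v = 20, 5, 80
print(recall(x, v))     # 0.2
print(precision(x, u))  # 0.8
```

A search can look good on precision (most of what came back was relevant) while missing most of the relevant documents, which is the pattern the recall figure exposes.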
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The Information Retrieval Problem: 
| Basic IR Technology 


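[Figure: diagram of the document collection by relevance and retrieval, including the region that is not relevant and not retrieved]

m When and where and how does the
recall vs precision distinction matter?
m How well does full text retrieval work?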
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|| The Information Retrieval Problem: 
Summary of Blair and Maron Study


m Searcher perception that their search was
exhaustive (recall > 75%) actual recall 20%


m No significant difference between searching 
ability of lawyer or paralegal 


m Searchers were only able to anticipate a small 
number of words and phrases that could be used 
to retrieve relevant documents and would not be 
in irrelevant documents 


m Extraordinary and unpredictable variability in the 
words and phrases used to discuss the same 
topics (e.g., the accident in the litigation referred to as situation, 
difficulty, event, what happened last week, and we all know why we 
are here ) 
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The Information Retrieval Problem: 
Why is IR such a difficult problem?


m Zipfian word distributions (plot of index 
words by rank gives a hyperbolic shape with 
long tails) 

m Scale is the problem 

ш Concept: futility point(s) 

m Demise of the library model 

ш Collection partitioning 

m IR as communication 

ш Importance of context 
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= Program to execute repetitive 
ш Perform special calculations 


= Create a custom user interface 


Programming with Visual Basic


BOOK1.XLS 


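Function Commissions(SharesSold, PricePerShare)
    ' Total value of the sale decides which rate applies
    TotalSalePrice = SharesSold * PricePerShare
    If TotalSalePrice <= 15000 Then
        Commissions = 25 + .03 * SharesSold
    Else
        ' Larger sales: the per-share charge applies to 90% of the shares
        Commissions = 25 + .03 * (.9 * SharesSold)
    End If
End Function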
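
For readers who want to check the logic outside Excel, here is a rough Python equivalent of the function macro. The function name and the sample inputs (100 shares at $55) are assumptions for illustration; the formula itself is taken from the Visual Basic listing.

```python
def commissions(shares_sold: float, price_per_share: float) -> float:
    """Flat $25 fee plus 3 cents per share; sales over $15,000 are charged
    on only 90% of the shares."""
    total_sale_price = shares_sold * price_per_share
    if total_sale_price <= 15000:
        return 25 + 0.03 * shares_sold
    return 25 + 0.03 * (0.9 * shares_sold)


small_sale = commissions(100, 55)      # $5,500 sale
large_sale = commissions(1000, 20)     # $20,000 sale
print(round(small_sale, 2), round(large_sale, 2))  # 28.0 52.0
```

Unlike the Visual Basic version, which returns a value by assigning to the function's own name, Python returns it with an explicit `return`.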
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|| Каз | with d Ll 


[Screenshot: worksheet that calls the function macro as Commissions(C3, C4); visible values: Sale Price = $55, Commission $28]
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Programming with Visual Basic 















m Macro Advantages
1. Faster operations
2. Reduce human involvement in repetitive tasks
3. Fewer errors
4. Fewer keystrokes per operation
5. Enhanced features and interface options
6. Wider calculation options

m Macro Disadvantages
1. Development time
2. Maintenance and support work over time
3. Uses macrosheet and worksheet
4. Forget to update information using macro
5. Macrosheet must be open to use the macro
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Alternatives to Macros 
m 1. Templates 
ш 2. Command shortcut keys 
m 3. Using style sheets in addition to templates 


ш 4. Use worksheet parameters 


Two Types of Macros
ш Command macros 
* execute menu commands 
* programmed using the recorder 
* user initiates the action - Ctrl-b
m Function macros 
* programmed =average( ) 
* e.g. formula paste functions 
* functions for special calculations 
* worksheet formula initiates macro 
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Local versus Global Macros 


m Local macros 
| • module MUST be open 
= Global macros 









• stored on a hidden macro sheet called global.xlm 
• automatically loaded with Excel 
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Macro Examples 





= Command macro for formatting borders 

m Function macro for converting temperature 
m Adding a new menu bar 

m interface for marketing forecasts 


ш Database for student class registration 
Distinct Parts of Function Macros


m Workbook name (if in a different workbook)
m Macrosheet name
m Separate macrosheet & function with !
m Name of function
m List of one or more function arguments
* example: [workbook.xls]demo!FtoC2(B4)








Debugging Visual Basic Code 

m Syntax errors
* Misspelled or missing keywords and punctuation marks
* Most are caught as you type (Intteger vs. Integer)

m Compile errors
* Syntax errors found only when you run the program
* If ... Then statement without an End If
* Generates an error message

m Runtime errors
* Errors found only when you run the program
* Division by zero or using a property with wrong object

m Logic errors
* No error message - output is not what you expected
* Use trace procedures - very difficult to debug









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poma Debugging Visual Basic Code 


m Setting a breakpoint in Visual Basic
* F9 will toggle a breakpoint 
* F5 will resume normal program execution to the next breakpoint 


m Stepping through a procedure 
* F8 will manually control line-by-line execution 
* Use Shift F8 to step over a procedure in a macro 

m Adding a watch expression while in break mode 
* View intermediate values of counters in loops and other variables
* Useful for finding logic errors 












Debugging Tips 


m Indent your code for readability
* easier to trace and decipher indented code
* indented code enhances code documentation
m Turn on syntax checking 
* Tools | Options 
* General module tab - check Display Syntax errors 
m Require variable declarations (Option Explicit) 
m Break down complex procedures into 
many small chunks 
m Enter all macros using lowercase 


* if Visual Basic does not change a keyword to normal case, 
then you typed the keyword incorrectly 















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Make sure the module containing the local macro is open 


* If you are trying to run the macro using a shortcut key, be certain 
the shortcut key has been defined 


* Check to see whether multiple macros were assigned the same 
shortcut key. If so, change one of the procedures 


* Make sure that another open module doesn't have a procedure with 
the same name 


m Use comments to temporarily deactivate parts of 
code that are giving you problems 


m Do not use reserved words and worksheet names
for macro names


m Use range names: =percent_raise rather than =B7


Debugging Tips


m Use lowercase "variable" names to tell variables
from reserved keywords 
m Command macros do not need a return value 
* difficult to debug complex formulas and procedures 








m Function macros must return a value
m Break up long statements 


* reduce to smaller chunks to facilitate isolation of the problem 
m Use user-defined constants 
* if constants are always constant, declaring values as constants 
— prevents values from changing 
— makes the code easier to understand 
— prevents you from using the wrong value in a formula 


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ш What is simulation? 
= Selected applications of simulation 

ш Comparing analytic solutions with simulation 

= Advantages and disadvantages of simulation 

m Types of simulation 

= Random number generation , 
m Discrete event simulation logic 

ш Statistical analysis of simulation output 

ш Simulation languages and animation 
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| One model of reality at varying levels of detail 


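[Figure: spectrum running from Real System through Simulation to Analytic Modeling; the level of realism increases toward the real system and the level of abstraction increases toward analytic modeling]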
m Uses a computer program that duplicates the essential 
behavior of a real physical system 
ш Inputs are given to the simulation program 
m Outputs are computed by the simulation program 
m Systematic manipulation of the inputs allows the 
evaluation of alternative decision policies 
m Select most desirable alternative from all possible runs 
m No guarantee that an optimum solution will be found 


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|| Advantages of simulation 


ш Realism - simplifying assumptions for analytical models 
are not needed. Any degree of complexity can be 
handled. 


m Time compression - months of real time can be 
simulated in seconds on the computer (e.g. weather forecasts) 


m Training - simulation requires less mathematical training
than analytical modeling


m Presentation of results - results are often easier and 
more intuitive to understand, especially if animation is 
used to "see" a proposed system in operation 


m impossible to solve analytically 


m Actual observation or operation is too expensive 
or takes too much time 
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Disadvantages of simulation 












m Failure to optimize - tyranny of large combinations of
alternatives (e.g. 3 labor x 4 looms x 8 staffing levels = 96)

m Long lead times - months of effort might be required to 
develop a major simulation model 

m Lack of generality of results - results only apply to
situation in the model and can not be extended out of
context. Must understand the underlying process to start.


ш Costs for developing simulation capability - 
hardware, software, training, support, and staffing 

m Model must contain uncertainty - otherwise the 
solution will be deterministic 

m Misuse of simulation - because it is easier to build
simulations, people who are not fully qualified can build models
that are incorrect or incomplete.
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Types of simulation 


ш Risk analysis models 

* Probabilistic bounds on ROI to quantify degree of uncertainty 
m Monte-Carlo models 

* Process model with little knowledge of parameter bounds 

* Crystal Ball
m Time-based simulation 

* continuous time simulation (e.g. oil refinery processes). 

* discrete event simulation (e.g. bank teller staffing). 

— static models show steady state equilibrium 


— dynamic models show transient response of the system to
changes in inputs


[Figure: an output variable (current level of work-in-process in a manufacturing system) plotted against simulated time]
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lg Random number generation 
m Uniform distribution over the range 5 to 15

r = 5 + 10 * rand()
= 5 when rand() is 0
= 14.9999 when rand() is .9999

m To include both 5 and 15 use:
r = 5 + int(10 * rand() + .1)     note: int() returns the integer portion of a number
= 15 when rand() is .9999
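
The two formulas can be tried out as below. The function names are hypothetical. One caveat, offered as an observation rather than something the slide states: the integer formula returns 5 only when rand() is below .09 and returns 15 only when rand() is .99 or more, so the end points come up less often than the values between them. The third function shows a common alternative, assumed here, that gives each integer from 5 to 15 the same chance.

```python
import random


def uniform_5_to_15(u: float) -> float:
    """Continuous value in [5, 15) for u in [0, 1)."""
    return 5 + 10 * u


def slide_integer_variant(u: float) -> int:
    """The slide's formula: integer from 5 to 15, reaching 15 when u >= .99."""
    return 5 + int(10 * u + 0.1)


def equal_chance_integer(u: float) -> int:
    """Alternative (an assumption, not from the slide): each of the 11
    integers 5..15 covers an equal slice of [0, 1)."""
    return 5 + int(11 * u)


print(uniform_5_to_15(0.0), slide_integer_variant(0.9999))  # 5.0 15

rng = random.Random(42)                      # seeded for repeatability
draws = [equal_chance_integer(rng.random()) for _ in range(1000)]
print(min(draws) >= 5 and max(draws) <= 15)  # True
```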


Random number generation


m Uniformly distributed random numbers
A common method of random number generation is to use a
multiplicative congruential formula R_n = k x R_(n-1) (modulo 1)

R_0 = .0123456789 (the initial seed value)
k = 100,003 (a constant)
R_1 = 100,003 x .0123456789 (modulo 1)
= 1234.6049270367 (modulo 1)     (modulo 1 returns the remainder)
= .6049270367
R_2 = 100,003 x .6049270367 (modulo 1)
= 60,494.5184511101 (modulo 1)
= .5184511101
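
The recurrence is easy to reproduce. The sketch below uses Python's `decimal` module so the digits match the hand calculation exactly; ordinary floating point would drift in the last places. The function name `congruential` is hypothetical.

```python
from decimal import Decimal


def congruential(seed: Decimal, k: Decimal, count: int) -> list[Decimal]:
    """Generate `count` values with R_n = k x R_(n-1) (modulo 1)."""
    values = []
    r = seed
    for _ in range(count):
        r = (k * r) % 1          # keep only the fractional part
        values.append(r)
    return values


sequence = congruential(Decimal("0.0123456789"), Decimal(100003), 2)
print(sequence[0])  # 0.6049270367
print(sequence[1])  # 0.5184511101
```

Because each value is fully determined by the previous one, the same seed always reproduces the same stream, which is what makes simulation runs repeatable.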


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ARM EXCEL Random number generation 


RAND() Returns an evenly distributed random number greater than
or equal to 0 and less than 1. A new random number is returned
every time the worksheet is calculated.


Remarks To generate a random real number between a and b, use:
RAND()*(b-a)+a


If you want to use RAND to generate a random number but don't want
the numbers to change every time the cell is calculated, you can
enter =RAND() in the formula bar and press F9 (or COMMAND + =
in Excel for the Macintosh) to change the formula to a random
number.






Example To generate a random number greater than or equal to 0 
but less than 100: RAND( )*100 


In Visual Basic use RND( ) 


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Discrete event simulation logic 














m Cumulative frequency functions


Interarrival time in hours        Cumulative frequency with
(time between arriving ships)     which these times occur

0 to 6                            0.1
6 to 12                           0.2
12 to 18                          0.9
18 to 24                          1.0


If random number = 0.7414,
what is the interarrival time (IAT)?


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|| Discrete event simulation logic 


[Figure: cumulative frequency function, with the random number (0.0 to 1.0) on the horizontal axis and the interarrival time on the vertical axis; the point (0.9, 18) is marked and the random number 0.7414 is read across to its interarrival time]


Discrete event simulation logic 





ш Cumulative frequency functions 
Within any interval of the distribution, values of the function are 
uniformly distributed. Interpret cumulative frequency functions 
as a series of straight-line segments that connect the function 
defined ordered pairs of points. 
Interarrival time in hours        Cumulative frequency with
(time between arriving ships)     which these times occur

0 to 6                            0.1
6 to 12                           0.2
12 to 18                          0.9
18 to 24                          1.0


IAT = IAT_lower + (rand - cfreq_lower) x ( (IAT_upper - IAT_lower) / (cfreq_upper - cfreq_lower) )
if rand = 0.7414

= 12 + (0.7414 - .2) x ((18 - 12)/(.9 - .2))

= 16.6
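
The straight-line interpolation can be written as a short lookup. This is a sketch: the name `interarrival_time` is hypothetical, and the starting point (0.0, 0 hours) is an assumption implied by the first interval beginning at 0.

```python
# (cumulative frequency, interarrival time in hours) at each interval boundary.
POINTS = [(0.0, 0.0), (0.1, 6.0), (0.2, 12.0), (0.9, 18.0), (1.0, 24.0)]


def interarrival_time(rand: float) -> float:
    """Map a random number in [0, 1] to hours by linear interpolation."""
    for (c_lo, t_lo), (c_hi, t_hi) in zip(POINTS, POINTS[1:]):
        if c_lo <= rand <= c_hi:
            return t_lo + (rand - c_lo) * (t_hi - t_lo) / (c_hi - c_lo)
    raise ValueError("rand must be between 0 and 1")


print(round(interarrival_time(0.7414), 1))  # 16.6
```

Seventy percent of the random numbers fall between .2 and .9, so seventy percent of the generated interarrival times land between 12 and 18 hours, matching the frequency table.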
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|| | || Discrete event simulation logic 


m Maintain "clock" to keep track of events
m Process events in time sequence
m Determine next type of event
m Process according to real world "event logic"
m Update statistics describing the simulation
m Terminate simulation
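
The steps above can be sketched as a small event loop. Everything specific here is an assumption for illustration: a single server, interarrival times uniform between 0 and 4 minutes, service times uniform between 2 and 4 minutes (chosen to resemble the ATM example later in these notes), and the function name `simulate_atm`.

```python
import heapq
import random


def simulate_atm(end_time: float = 480.0, seed: int = 1) -> dict:
    rng = random.Random(seed)
    events = []                       # future events as (time, sequence, kind)
    seq = 0
    heapq.heappush(events, (rng.uniform(0, 4), seq, "arrival"))
    clock = 0.0                       # the simulation clock
    waiting = 0                       # customers standing in line
    busy = False                      # is the server occupied?
    served = 0
    busy_time = 0.0
    while events:
        time, _, kind = heapq.heappop(events)   # next event in time sequence
        if time > end_time:                     # terminate the simulation
            break
        if busy:                                # update statistics
            busy_time += time - clock
        clock = time
        if kind == "arrival":                   # event logic: arrival
            seq += 1
            heapq.heappush(events, (clock + rng.uniform(0, 4), seq, "arrival"))
            if busy:
                waiting += 1
            else:
                busy = True
                seq += 1
                heapq.heappush(events, (clock + rng.uniform(2, 4), seq, "departure"))
        else:                                   # event logic: departure
            served += 1
            if waiting:
                waiting -= 1
                seq += 1
                heapq.heappush(events, (clock + rng.uniform(2, 4), seq, "departure"))
            else:
                busy = False
    if busy:
        busy_time += end_time - clock           # close out the final interval
    return {"served": served, "waiting": waiting,
            "utilization": busy_time / end_time}


result = simulate_atm()
print(result)
```

The heap keeps pending events ordered by time, so the loop never has to step through empty clock ticks; the clock jumps straight from one event to the next.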


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Harbor Simulation (Textbook page 82-84) 








[Flowchart: Initialize model; Arrival time; berth open? (no / Unload); Unloading time; Print statistics]
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a Harbor Simulation (Textbook page 82-84) 







Statistical analysis 
of simulation output 


m Trade-off between number of berths and 
waiting time 

ш Utilization of berth capacity 

= Maximum number of ships waiting 

m Mean waiting time (23.69 hours transient only) 

m Std Dev (14.36 hours transient only) 

m 95% Confidence interval vs point estimate
23.69 - 1.96 x (14.36 / 100^.5) = 20.88
23.69 + 1.96 x (14.36 / 100^.5) = 26.50
Statistical analysis
of simulation output
For a sample mean x̄ and a sample standard
deviation s we can construct a confidence interval
that we can be reasonably sure contains
the true population mean.


95% confidence interval for μ = x̄ ± 1.96 x s / n^.5


Of course, 5% of the time the population mean will 
fall outside this interval. This is true because the 
sample means are normally distributed and 5% of 
the values of a random variable in a normal 
distribution fall more than 2 standard deviations 
from the mean. 
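
As a worked check, the interval formula can be evaluated for the harbor figures quoted earlier (mean waiting time 23.69 hours, standard deviation 14.36 hours, 100 observations). The function name is hypothetical.

```python
import math


def confidence_interval(mean: float, std_dev: float, n: int, z: float = 1.96):
    """Return (low, high) for mean +/- z * std_dev / sqrt(n)."""
    half_width = z * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width


low, high = confidence_interval(23.69, 14.36, 100)
print(round(low, 2), round(high, 2))  # 20.88 26.5

# Quadrupling the sample size halves the width of the interval.
low_4n, high_4n = confidence_interval(23.69, 14.36, 400)
print(round((high - low) / (high_4n - low_4n), 2))  # 2.0
```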


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Simulation 


Statistical analysis 
of simulation output


m iid Assumptions 
• independent replications using unique random numbers 
* identical distribution of random numbers (e.g., still uniform
from 12 to 16 but the specific random values change)


m Central Limit Theorem 


* no matter what distribution is followed by the point estimates, 
their average will be approximately normally distributed for 
samples of size 30 or more. This allows us to compute a 
confidence interval using a z-statistic rather than a t-statistic. 

= How many replications? What batch size? 

* depends on computer time and size of model 

* importance of the decision and if needed in "real time" 

* quadrupling the sample size halves the confidence interval 


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Simulation languages 
and animation 


STUDENT GPSS/H RELEASE 2.0 (AY130) FILE: ds101b.gps


       SIMULATE                base time unit: 1 minute

ATM    STORAGE    1            define number of ATM machines
       GENERATE   2,2          people arrive, one by one
       QUEUE      LINE         start LINE queue membership
       ENTER      ATM          request/capture the ATM machine
       DEPART     LINE         end LINE queue membership
       ADVANCE    3,1          conduct an ATM transaction
       LEAVE      ATM          done with the ATM transaction
       TERMINATE  0            leave the system

       GENERATE   480          minutes per 8 hours
       TERMINATE  1            end simulation
       START      1            simulate 480 minutes of service
       END                     end of Model-File execution







Simulation languages 
and animation 


ATM SIMULATION OUTPUT 
Simulation begins. RELATIVE CLOCK: 480 minutes 


ATM UTILIZATION
163 ENTRIES
1 MAXIMUM CONTENTS
2.938 AVERAGE SERVICE
1 CAPACITY
1 CURRENT CONTENTS
0.998 AVERAGE UTILIZATION

QUEUE CONTENTS
227 TOTAL ENTRIES
64 MAXIMUM CONTENTS
31.595 AVERAGE LINE CONTENTS
66.809 AVERAGE TIME/UNIT
64 CURRENT CONTENTS


Are these statistics realistic? 













Simulation languages 
and animation 


■ Proof Animation
• post-simulation tool
• language independent (portable)
• vector-based and file-driven
• includes CAD-like tools and CAD file import/export

■ Demonstration

■ Animation provides
• visual program feedback for debugging
• enhanced presentation convinces audience of model realism
• time-consuming to develop
• might be examined in lieu of rigorous statistical analysis


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OPIM 101 Introduction to the Computer as an Analysis Tool Fall 1995 
Operations and Information Management Department Final Exam 
The Wharton School of the University of Pennsylvania Professor Lohse 






The University of Pennsylvania Academic Code of Integrity states:


Any work that a student undertakes as part of progress toward a degree or 
certification must be the student's own, unless the relevant instructor specifies 
otherwise. That work may include examinations, whether oral or written, oral 
presentations, laboratory exercises, papers, reports, and other written assignments. 


This two hour exam is closed book and closed notes. All answers must be written neatly or 
printed carefully on the exam. No credit will be given when handwriting can not be read. You 
may separate the pages for convenience in working the exam. Use the back of the pages for 
scratch paper. We have a stapler you can use to refasten the pages when you turn in your exam. 
Write your name and section number on the top of each exam page. Misconduct during an 
examination is a violation of the Code. Any student who violates this Code will receive a zero for 
the work in question and will be referred to the Judicial Inquiry Officer for further action. 




I agree to abide by the provisions of the Code of Academic Integrity, and I certify that I 
will comply with the Code in taking this examination. 





Print your name _ Social Security Number 


Circle your section number 
12:00 1:00 2:00 
section 1 section 2 section 3 


GOOD LUCK and Season's Greetings from the staff
Points Available 
20 
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12. Steady-state conditions in a simulation, 
A contain the only observations that should be used for statistical analysis.
B vary stochastically around some expected value.
C are independent and easily discernible from transient conditions.
D none of the above


13. Bob Jones is the database administrator for Budget Airlines. Marketing staff are complaining 
to him that reservation data for confirmed passengers exclude two passengers on every 
Boeing 727 plane. This database issue is most related to 


A data independence.
B data integrity.
C data dictionary.
D data security.


14. All of the following are disadvantages of neural networks except 
A The actual rules embodied in the neural network are not readily apparent.
B The neural network can not self-optimize to automatically adjust its parameters to learn
over time.
C Neural networks will not work well at solving problems for which sufficiently large and
general sets of training data are not available.
D Complex, expensive computers with multiple processors are needed for large complex
problems.


15. A linear program for maximizing total contribution with an objective function that can be 
made infinitely large without violating any of the constraints is called 


A redundant.
B unbounded.
C extreme.
D infeasible.





16. A survey of every student in OPIM 101 finds that 70% feel they will earn a grade of "A". The
heuristic or human information processing bias that best accounts for these survey results is
A group think
B overconfidence
C illusion of control
D availability
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17. Which of the following is not a valid URL? 
A http://opim.wharton.upenn.edu 
B ftp//guru.cem.ch
C gopher://world.std.com:70/
D html://library.princeton.edu


18. A neural network used for stock price prediction would most likely 
A be an unsupervised neural network.
B be a supervised neural network.
C be a feedforward neural network.
D none of the above.





19. Which of the following techniques is used to find the optimum solution for a certain type of
large, complex problems?
A linear programming
B neural networks
C discrete-event simulation
D genetic algorithms






20. All of the following are reasons for using simulation models except
A for evaluation of the fitness of individuals in a genetic algorithm.


B for models without uncertainty. 


C when real world measurement and observation of an existing system is too disruptive or 
expensive. 
D when the underlying assumptions for analytical models are not met. 
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(7)Visual Basic Programming and graphs 


Sub question1()   'What value is displayed in the message box for countj and counti
    Dim i, j, countj, counti As Integer   '1 point each

    counti = 0
    countj = 0

    For j = 100 To 20 Step -3
        countj = countj + 1
        For i = -30 To 0 Step 2
            counti = counti + 1
        Next i
    Next j

    MsgBox "The value of count is " & countj, 0, "For outer loop"   ' Answer: 27
    MsgBox "The value of count is " & counti, 0, "For inner Loop"   ' Answer: 432
End Sub

Sub question2()   'What is the label of the message box displayed by this macro?
    Dim a, b, c, d As Integer   '2 points
    Dim flag As Boolean         ' Answer: Message Box 1

    flag = False
    If (((a + d) = (c + b) And Not (flag)) Or a = b) Then
        MsgBox "False", 0, "Message Box 1"
    Else
        MsgBox "True", 0, "Message Box 2"
    End If
End Sub

Sub question3()   'What is the value of total displayed in the message box?
    Dim total, sum(3), max As Integer   '2 points
    max = 3                             ' Answer: 55
    sum(1) = 20
    sum(2) = 10
    sum(3) = 5

    total = question(sum, max)

    MsgBox "Total equals " & total, 0, "What is the value of total?"
End Sub

Function question(sum, max)
    Dim i As Integer
    For i = 1 To max
        question = question + sum(i) * i
    Next i
End Function
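The answer-key values for question1 and question3 can be checked with a short Python sketch. It is a translation for verification only; the function names are hypothetical, and the `range` bounds are chosen to match Visual Basic's inclusive `To` limits.

```python
def count_loops():
    """Count outer and inner loop iterations from question1."""
    count_j = 0
    count_i = 0
    for j in range(100, 19, -3):      # VB: For j = 100 To 20 Step -3
        count_j += 1
        for i in range(-30, 1, 2):    # VB: For i = -30 To 0 Step 2
            count_i += 1
    return count_j, count_i

def weighted_total(values):
    """question3: each element is multiplied by its 1-based position."""
    return sum(value * position for position, value in enumerate(values, start=1))

count_j, count_i = count_loops()
total = weighted_total([20, 10, 5])
print(count_j, count_i, total)
```

The outer loop visits 100, 97, ..., 22 and the inner loop visits -30, -28, ..., 0 on every outer pass, which is where the product of the two counts comes from.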
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(1) Graphs
The graph below will be presented in a senior level management meeting to discuss marketing
strategies for Honda. What is the graph trying to show?

Slope of the linear increase in market share over time.



Comment on the appropriateness of this graph to display the data accurately. 


The dollar signs are difficult to compare and identify a trend. 2D value displays impart no 
information in second dimension. No label for percentage on y axis. Shaded portion at tip of 
dollar sign confuses viewer as to where the data values are. 








[Figure: "Honda's USA Percent Market Share", a dollar-sign pictograph for 1984 through 1989; the image is not recoverable from the source.]
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(20) Linear programming. Parker Sisters manufactures ball-point pens, mechanical pencils and 
fountain pens. The company is trying to plan its production mix for each week. Joe believes that 
the company can sell any number of pens and pencils it produces, but production is limited.
Because of a recent strike and certain cash-flow problems, the suppliers are only willing to deliver
at most 1000 ounces of plastic, 1200 ounces of chrome, and 2000 ounces of stainless steel each
week. These are variable costs since suppliers do not require Parker Sisters to buy a fixed
amount each week. Use the model below and the Excel Solver output to answer the following
questions. (Assume each question is independent unless otherwise stated). 








Parker Sisters Inc.        Ballpoint   Mechanical   Fountain       Total
Number to manufacture         700.00         0.00     133.33      833.33
Contribution margin            $3.00        $3.00      $5.00   $2,766.67

Resource usage per unit                                 Usage         Max
Plastic    [coefficients illegible]                   1000.00   <=   1000
Chrome     0.8   0   [illegible]                       866.67   <=   1200
Steel      2     3   5                                2000.00   <=   2000

Sensitivity Report

Name                         Final    Reduced   Objective     Allowable     Allowable
                             Value    Cost      Coefficient   Increase      Decrease
Number Manufacture PenB      700       0        3.00          2.00          0.78
Number Manufacture Pencil    0        -1.38     3.00          1.38          1E+30
Number Manufacture PenF      133.33    0        5.00          [illegible]   2.00

Name                         Final    Shadow    Constraint    Allowable     Allowable
                             Value    Price     R.H. Side     Increase      Decrease
Plastic Usage                1000      1.17     1000          200           466.67
Chrome Usage                 866.67    0        1200          1E+30         333.33
Steel Usage                  2000      0.80     2000          555.56        333.33

a. Name all of the constraints that are binding. Explicitly state each type of constraint!
(1 point each; 3 points maximum) 

Plastic usage (structural) 

Steel usage (structural) 

Number Manufacture Pencil >= 0 (non-negativity) 


b. A local distributor has offered to sell Parker Sisters an additional 500 ounces of stainless steel 
for $0.60 per ounce more than it ordinarily pays. Should the company buy the steel at this 
price? Support your answer by explaining what happens to Parker Sisters' product mix. What 
is the total weekly contribution if it does buy the stainless steel?

(1 point shadow price comparison; 1 point mix; 1 point contribution; 3 points maximum)

Yes ($0.60 < $0.80). Parker Sisters is willing to pay up to an $0.80 premium for stainless steel. 500 x $0.80 =
$400 increase in total weekly contribution ($2,766.67 + $400 = $3,166.67). The mix will change!
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c. What must the minimum contribution margin per mechanical pencil be in order to make them 
worthwhile to produce? Explain your response. 

(2 points; no partial credit) 

At a contribution margin greater than $3.00 + $1.38 ($4.38), objective ranging is no longer valid.

Mechanical pencils may become part of the mix at this point. 


d. Parker Sisters buys its plastic for $5.00 per ounce. This week, Parker Sisters has an 
opportunity to sell 300 ounces of plastic to another company for $6.50 per ounce. The other 
company does not produce pencils or pens and is not a competitor. Should Parker Sisters sell 
the plastic? Support your answer by calculating the change in total weekly contribution if
Parker Sisters does sell the plastic.

(1 point for each part; 4 points maximum)

(1) The $1.50 premium for plastic ($6.50 - $5.00) is greater than the marginal value to Parker Sisters (the

shadow price is only $1.17). A 300 ounce decrease is within the allowable range for the shadow price.

(1) 300 x $1.17 = $351 decrease in total weekly contribution ($2,766.67 - $351 = $2,415.67) 

(1) PLUS a 300 x $1.50 increase from the sale of plastic ($450 + $2,415.67 = $2,865.67) 

(1) The net increase in total weekly contribution is $99.00 per week! 


e. The R&D department at Parker Sisters has been redesigning the mechanical pencil. The new 
design requires 1.1 ounces of plastic, 2.0 ounces of chrome, and 2.0 ounces of stainless steel. 
If the company can sell one of these pencils with a contribution margin of $3.00, should it
approve the new design? Explain your response. 

(0.5 points each correct marginal value; 0.5 points for 2.887; 1 point for explanation; 3 points maximum ) 

1.1 ounces plastic x $1.17 + 2 ounces chrome x $0 + 2.0 ounces steel x $0.80 = 

(1.287 + 0 + 1.60) = $2.887

$2.89 < $3.00 Contribution is greater than marginal cost therefore YES approve! 


f. Marketing believes that the company should produce at least 20 mechanical pencils per week 
to round out its product line. What effect would this have on total weekly contribution 
margin? Explain your response. 

(1 point partial credit for calculation; 1 point for explanation; 2 points maximum)

Reduced cost is $1.38 per pencil x 20 pencils = $27.60 decrease per week 

($2,766.67 - $27.60 = $2,739.07) 


g. If the per-unit contribution margin per ball-point pen decreases to $2.25, what is the new total 
weekly contribution margin and the new product mix? Explain your response. 

(1 point partial credit for calculation; 1 point for explanation; 2 points maximum) 

$2.25 is within the allowable range ($3.00 - 0.78 allowable decrease). 

The mix does not change! 700 Ball-point Pen and 133.33 Fountain Pens 

700 x $0.75 = $525 decrease in total weekly contribution ($2,766.67 - $525 = $2,241.67) 


h. The chrome supplier might have to fulfill an emergency order, and would be able to send only
1000 ounces of chrome this week instead of the usual 1200 ounces. What effect would this
have on total weekly contribution margin? Explain your response.

(1 point; no partial credit) 

No change! They only use 866.67 ounces per week & shadow price is $0 for chrome 
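The arithmetic behind answers d through g can be reproduced in a few lines of Python. This is only a check of the answer key's numbers using the shadow prices and reduced cost quoted in the answers; the variable names are hypothetical, and each calculation is valid only inside the allowable ranges cited above.

```python
BASE_CONTRIBUTION = 2766.67                                    # current total weekly contribution
SHADOW_PRICE = {"plastic": 1.17, "chrome": 0.00, "steel": 0.80}
PENCIL_REDUCED_COST = 1.38                                     # magnitude of the pencil's reduced cost

# (d) sell 300 ounces of plastic at a $1.50 premium over cost
lost_contribution = 300 * SHADOW_PRICE["plastic"]              # value of the plastic in production
sale_premium = 300 * (6.50 - 5.00)
net_change_d = sale_premium - lost_contribution

# (e) marginal resource cost of one redesigned pencil
marginal_cost_e = (1.1 * SHADOW_PRICE["plastic"]
                   + 2.0 * SHADOW_PRICE["chrome"]
                   + 2.0 * SHADOW_PRICE["steel"])
approve_new_design = 3.00 > marginal_cost_e

# (f) forcing 20 mechanical pencils into the mix
decrease_f = 20 * PENCIL_REDUCED_COST

# (g) ball-point margin drops from $3.00 to $2.25 with an unchanged mix of 700 pens
decrease_g = 700 * (3.00 - 2.25)

print(round(net_change_d, 2), round(marginal_cost_e, 3), approve_new_design)
print(round(BASE_CONTRIBUTION - decrease_f, 2), round(BASE_CONTRIBUTION - decrease_g, 2))
```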
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(20) Decision Analysis. Colonial Motors is trying to determine what size of manufacturing plant 
to build for a new car it is developing. Only two plant sizes are under consideration: large and
small. The cost of constructing a large plant is $[amount illegible] million and the cost of constructing a small
plant is $15 million. Colonial Motors believes a 70% chance exists that the demand for this new
car will be high and a 30% chance that it will be low. The following table summarizes expected 
profits as a function of factory size and demand. 
Colonial Motors can purchase a national marketing survey from Megabucks Consulting that will 
measure consumer attitudes towards the new car. Consumer attitudes can be favorable or 
unfavorable. If they are favorable, demand will be high. If they are unfavorable, demand will be 
low. The survey will cost $0.250 million. When demand is high, they successfully predict a 
favorable consumer attitude 6/7 of the time. When demand is low, they successfully predict an
unfavorable consumer attitude 7/9 of the time. The following table summarizes these conditional
probabilities. 

Conditional probabilities for a given level of demand

                 Favorable   Unfavorable
High demand         6/7          1/7
Low demand          2/9          7/9

[Table: profits (in $ millions) by plant size and demand level; the values are not legible in the source.]



The table is worth 4 points: 16 entries at 0.25 points each.
What is the probability that the survey finds a favorable consumer attitude? 2/3 (0.5 points)

What is the probability that the survey finds an unfavorable consumer attitude? 1/3 (0.5 points)
Complete the decision tree on the next page. Label all branches. Neatly show all calculations.

EMV without sample information   $126 million (2 points; no partial credit)
EMV with sample information      $126.42 million (2 points; no partial credit)
EPPI                             $132 million (2 points; no partial credit)
EVPI                             $6 million (3 points; no partial credit)
EVSI                             $666,667 (3 points; no partial credit)
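The survey probabilities and the value-of-information figures above can be checked with exact fractions. This Python sketch uses only numbers stated in the problem and the answer key; the variable names are hypothetical, and the profit table itself is not used because it is not legible in the source.

```python
from fractions import Fraction

p_high, p_low = Fraction(7, 10), Fraction(3, 10)      # prior demand probabilities
p_fav_given_high = Fraction(6, 7)                      # survey accuracy when demand is high
p_unfav_given_low = Fraction(7, 9)                     # survey accuracy when demand is low

# Total probability of each survey outcome
p_fav = p_high * p_fav_given_high + p_low * (1 - p_unfav_given_low)
p_unfav = 1 - p_fav

# Posterior probabilities of high demand (Bayes' rule)
p_high_given_fav = p_high * p_fav_given_high / p_fav
p_high_given_unfav = p_high * (1 - p_fav_given_high) / p_unfav

# Value of perfect information, in $ millions: EPPI minus EMV without information
evpi = 132 - 126

print(p_fav, p_unfav, p_high_given_fav, p_high_given_unfav, evpi)
```

The $666,667 EVSI is consistent with these fractions if the $126.42 million figure is read as net of the $0.25 million survey cost: 126.42 + 0.25 - 126 is about 0.67.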





[Figure: completed decision tree for the Colonial Motors problem (answer key). The page is rotated in the scan and its branch labels and values are not legible in the source.]



OPIM101-FALL95 name section number


(10) Database Normalization The inventory tracking system for Lotek Industries identifies the 
location of the corporation's office furniture and computers. The only table in the "database" 
contains the following data shown below. Assume that this data is the entire universe of records. 
(A) Using proper normalization procedures, convert the "database" into a collection of relational 
tables that are all in 3NF. Label the primary keys, foreign keys, and concatenated keys on each 
table. To save time, only state the relations. Do not list the data in each table. 


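Equipment | Value | Room | Floor | Manager | Location | Building
Office chair | 234.82 | 219 | 2 | Browning | 126 Western Blvd | Ramsey
Floor Lamp | 123.99 | 303 | 3 | Browning | 126 Western Blvd | Ramsey
Office chair | 234.82 | 303 | 3 | Browning | 126 Western Blvd | Ramsey
[the remaining rows of the table are not legible in the source]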


Equipment | Room
[illegible] | 5
486DX2 Computer | 218
486DX2 Computer | 303
Floor Lamp | 5
Floor Lamp | 219
Floor Lamp | 303
Office chair | [illegible]
Office chair | 303
Office desk | 23
Office desk | 219
[other rows are not legible in the source]

Equipment | Value
Office chair | 234.82
Office desk | 989.06
[other rows are not legible in the source]

Room | Building | Floor
5 | Annex | [illegible]
[illegible] | Annex | [illegible]
[illegible] | Bayonne | [illegible]
219 | Ramsey | 2
[other rows are not legible in the source]

Building | Location | Manager
Annex | 124 Western Blvd | [illegible]
Bayonne | 2318 Lod Circle | [illegible]
Ramsey | 126 Western Blvd | Browning



2 points each correct table - no partial credit 
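One way to see that the four relations lose no information is to load the legible rows into them and join them back into the original flat record. The Python/SQLite sketch below is an illustration under assumptions: the table names `EquipmentRoom`, `EquipmentValue`, `Room` and `Building` are hypothetical labels for the four answer tables, and only the three fully legible source rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Building (
    Building TEXT PRIMARY KEY,
    Location TEXT,
    Manager  TEXT
);
CREATE TABLE Room (
    Room     INTEGER PRIMARY KEY,
    Building TEXT REFERENCES Building(Building),   -- foreign key
    Floor    INTEGER
);
CREATE TABLE EquipmentValue (
    Equipment TEXT PRIMARY KEY,
    "Value"   REAL
);
CREATE TABLE EquipmentRoom (
    Equipment TEXT REFERENCES EquipmentValue(Equipment),  -- foreign key
    Room      INTEGER REFERENCES Room(Room),              -- foreign key
    PRIMARY KEY (Equipment, Room)                         -- concatenated key
);
INSERT INTO Building VALUES ('Ramsey', '126 Western Blvd', 'Browning');
INSERT INTO Room VALUES (219, 'Ramsey', 2), (303, 'Ramsey', 3);
INSERT INTO EquipmentValue VALUES ('Office chair', 234.82), ('Floor Lamp', 123.99);
INSERT INTO EquipmentRoom VALUES ('Office chair', 219), ('Floor Lamp', 303), ('Office chair', 303);
""")

# Rebuild the original single-table rows by joining the normalized tables.
flat_rows = conn.execute("""
SELECT er.Equipment, ev."Value", er.Room, r.Floor, b.Manager, b.Location, b.Building
FROM EquipmentRoom AS er
JOIN EquipmentValue AS ev ON ev.Equipment = er.Equipment
JOIN Room AS r ON r.Room = er.Room
JOIN Building AS b ON b.Building = r.Building
ORDER BY er.Room, er.Equipment
""").fetchall()

for row in flat_rows:
    print(row)
```

Making `Equipment` alone the key of `EquipmentValue` is what parts (B) and (C) go on to challenge: a second price for the same equipment type violates that key.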
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(B) Suppose the following record is added to the universe of records. Does the collection of 
relational tables in third normal form created in (A) above change? If so, what changes would 
you make? You may not change any relation. Be certain to list the revised table or tables 
containing this record with all of the data from the data in part (A) above. 














Equipment | Value
486DX2 Computer | 2367.9
Floor Lamp | 123.99
Office chair | 234.82
Office chair | 134.28
Office desk | 989.06

Add new entry into Equipment value table.
Equipment and Value must be composite primary key.

1 point - no partial credit 


(C) Suppose the following record is added to the universe of records (in addition to the one 
record in (B) above). There are now three chairs in room 7. Two are priced at $234.82 and one 
is priced at $134.28. Does the collection of relational tables in third normal form created in (A)
above change? If so, what changes would you make?
[New record, only partly legible in the source: ... 234.82 ... 124 Western Blvd]













Equipment and Value no longer uniquely identify an inventory item. Must add a new
attribute that will uniquely key each entry in this table
(e.g. Equipment identification number).


1 point - no partial credit 
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(3) Database Transitive Dependencies The following table contains student advising 
information. Student status is based on the following classification. Freshmen have 30 credit
hours or less. Sophomores have 60 credit hours or less. Juniors have 90 credit hours or less.
Seniors have 91 credit hours or more. An advisor (AdvLname) only advises in his/her major
teaching field. For example, a Computer Information Systems (CIS) professor only advises CIS
majors, an English professor only advises English majors, and so on. Each advisor has his/her
own office (AdvOffice). Advisors do not share offices. Each office has a unique identification.
For example, there is only one office identified as KOG-109. The only table in the "database"
contains the data shown below. Assume that this data is the entire universe of records.


10025 | Management | [illegible] | [illegible] | Gonzales | KOG-209
10026 | Marketing | [illegible] | Fr | Roberson | KOG-328
10027 | English | 48 | So | Nealy | VLG4164
10028 | Fine Arts | 37 | So | Kalenberger | FA-234H
10029 | Management | 93 | Sr | Gonzales | KOG-209
10030 | CIS | [illegible] | [illegible] | Raffeny | CSB-320
[the rows between 10030 and 10035 are not legible in the source]
10035 | English | 45 | So | Antons | VLG3202



(A) Make a list of all transitive dependencies in the "database". 







StuClass is a calculated field with a transitive dependency on StuHours 


AdvOffice has a transitive dependency on AdvLname - knowledge of office identifies the advisor 


AdvLname IS NOT DEPENDENT on StuMajor. Look at Nealy. Advisors with the same last 
name teach and advise in different academic areas. If there was a unique faculty ID number, then 
an advisor major transitive dependency would exist. | 


StuMajor IS NOT DEPENDENT on AdvLname. Two advisors have the same last name. Nealy
and Antons both advise in English.


A transitive dependency exists when an attribute is dependent on another attribute that is neither a 
primary key nor part of a composite primary key 
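A dependency of this kind can be tested mechanically: X determines Y when no two rows agree on X but differ on Y. The Python sketch below is a minimal illustration; the function names are hypothetical, the first two rows come from the legible part of the table, and the two "Nealy" rows use made-up hours, majors and offices (`X-1`, `X-2`) only to mimic the answer key's remark that two advisors share that last name.

```python
def stu_class(hours):
    """StuClass as a calculated field of StuHours, per the stated classification."""
    if hours <= 30:
        return "Fr"
    if hours <= 60:
        return "So"
    if hours <= 90:
        return "Jr"
    return "Sr"

def determines(rows, x, y):
    """True if attribute x functionally determines attribute y in these rows."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[x], row[y]) != row[y]:
            return False
    return True

rows = [
    {"StuMajor": "Fine Arts", "StuHours": 37, "StuClass": "So",
     "AdvLname": "Kalenberger", "AdvOffice": "FA-234H"},
    {"StuMajor": "Management", "StuHours": 93, "StuClass": "Sr",
     "AdvLname": "Gonzales", "AdvOffice": "KOG-209"},
    # Hypothetical rows: two different advisors who share a last name
    {"StuMajor": "English", "StuHours": 48, "StuClass": "So",
     "AdvLname": "Nealy", "AdvOffice": "X-1"},
    {"StuMajor": "CIS", "StuHours": 75, "StuClass": "Jr",
     "AdvLname": "Nealy", "AdvOffice": "X-2"},
]

print(determines(rows, "StuHours", "StuClass"))    # hours fix the class
print(determines(rows, "AdvOffice", "AdvLname"))   # an office identifies its advisor
print(determines(rows, "AdvLname", "StuMajor"))    # a shared last name breaks this one
```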


(B) Suppose a student record is deleted from the table. What deletion anomalies may
occur? Could delete information about all advisors for a major if only 1 advisor/major


Information about the faculty advisor and his/her office is lost. 
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(6) Conceptual Database Design
a) How many Access files (*.mdb) would be used to store these tables? 1 


b) How many tuples does Degree contain? _ 18 


c) How many attributes does Employee contain? 7 





d) How many fields does Employee contain? 7

e) Identify the primary and foreign keys for each of the following tables. If there is no foreign
key, enter NONE under the foreign key. If a table has a concatenated key, identify all
components.
4 points total; 16 answers at 0.25 points each
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(4) SQL Query Part A Write an SQL query that returns a frequency table showing the count of 
the number of employees having a Bachelor's, Master's, or Ph.D. degree. QUAL_NAME and
Count of QUAL_CODE are the two attributes in the dynaset.

Microsoft Access 2.0 SQL (1 point partial credit for each clause; 3 points maximum)

SELECT DISTINCTROW QUALIFICATION.QUAL_NAME, Count(DEGREE.QUAL_CODE) AS CountOfQUAL_CODE
FROM QUALIFICATION INNER JOIN DEGREE ON QUALIFICATION.QUAL_CODE = DEGREE.QUAL_CODE
WHERE ((DEGREE.QUAL_CODE<="3"))
GROUP BY QUALIFICATION.QUAL_NAME;

Primis Book SQL (1 point partial credit for each clause; 3 points maximum)
SELECT DISTINCTROW QUAL_NAME, Count(QUAL_CODE) AS CountOfQUAL_CODE
FROM QUAL
WHERE QUAL_CODE<="3"
GROUP BY QUAL_NAME


SQL Query Part B What will the resulting SQL query in Part A return? Show this dynaset
below. 
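The shape of the dynaset can be seen by running an equivalent query against a small made-up data set. The Python/SQLite sketch below is not the exam's answer: the rows, the `EMP_ID` column, and the mapping of codes '1' to '3' onto the three degrees are all hypothetical, and Access's `DISTINCTROW` keyword is dropped because SQLite does not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE QUALIFICATION (QUAL_CODE TEXT PRIMARY KEY, QUAL_NAME TEXT);
CREATE TABLE DEGREE (EMP_ID INTEGER, QUAL_CODE TEXT REFERENCES QUALIFICATION(QUAL_CODE));
-- Hypothetical data: code '4' stands for some other qualification
INSERT INTO QUALIFICATION VALUES
    ('1', 'Bachelor''s'), ('2', 'Master''s'), ('3', 'Ph.D.'), ('4', 'Certificate');
INSERT INTO DEGREE VALUES (1, '1'), (2, '1'), (3, '2'), (4, '3'), (5, '4');
""")

rows = conn.execute("""
SELECT Q.QUAL_NAME, COUNT(D.QUAL_CODE) AS CountOfQUAL_CODE
FROM QUALIFICATION AS Q INNER JOIN DEGREE AS D ON Q.QUAL_CODE = D.QUAL_CODE
WHERE D.QUAL_CODE <= '3'
GROUP BY Q.QUAL_NAME
ORDER BY Q.QUAL_NAME
""").fetchall()

for name, count in rows:
    print(name, count)
```

The result has one row per qualification name with the number of matching DEGREE rows, and the WHERE clause filters out the fourth code before grouping.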
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(4) ER Diagram Draw a complete entity-relationship diagram. Include all entities, and 
relationships. Name the relationships and label (1:1, 1:M, and M:M). Do not show any 
attributes.

2.5 points max; 0.5 points partial credit for each correct relationship
0.5 points for correct attributes (no partial credit BUT no need to distinguish foreign vs. primary key) 
1 point for correct entities 







(6) Simulation A small convenience grocery store has one check-out counter. Currently,
customers form a single queue and wait in line. The store is very concerned about peak rush
demand and they are considering adding another check-out counter. Before making this
investment, the store owner decides to develop a simulation model of the check-out line. An
industrial engineer studied the patterns regarding how much time lapses before a new customer
arrives in the line during the noon lunch hour. Customers arrive in line every two minutes. The
engineer concludes that this time is uniformly distributed with a range from 1 minute to 3
minutes. Further, it was found that it takes roughly 0.3 minutes per item to check-out.
Customers purchase from 1 to 10 items and these are also uniformly distributed. You would like
to build a Visual Basic program to run this simulation. Before you do, you decide to test your
understanding of the process by developing a manual simulation and run it for 6 minutes. A flow
chart and table of random numbers are shown below.
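Each random number between 0 and 1 has to be turned into either an interarrival time or a service time. The Python sketch below shows one common way to do that; the function names are hypothetical, and the mapping of a random number to an item count (equal-width bins for 1 through 10 items) is an assumption, since the exam does not spell out its convention.

```python
def interarrival_time(u):
    """Uniform on [1, 3] minutes: lower bound plus u times the range."""
    return 1.0 + (3.0 - 1.0) * u

def items_purchased(u):
    """Assumed mapping: 1 to 10 items, each equally likely, for u in [0, 1)."""
    return 1 + int(10 * u)

def service_time(u):
    """0.3 minutes per item purchased."""
    return 0.3 * items_purchased(u)

first_gap = interarrival_time(0.2068)     # using random_number(1) from the table
example_items = items_purchased(0.2833)   # using random_number(6) from the table
example_service = service_time(0.2833)
print(round(first_gap, 3), example_items, round(example_service, 3))
```

Which random number feeds which quantity, and in what order, is part of what the manual simulation asks the student to decide.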


[Flow chart: boxes labeled "Initialize model", "Advance clock", "customer arrivals" and "customer service", and a decision "Check-out open?" with a "no" branch; the connections between them are not legible in the source.]

























random_number(1) = 0.2068
[random_number(2) through random_number(5) are not legible in the source]
random_number(6) = 0.2833
random_number(7) = 0.6276
random_number(8) = 0.5414
random_number(9) = 0.7666
[random_number(10) and random_number(11) are not legible in the source]
random_number(12) = 0.6489
random_number(13) = 0.9564
random_number(14) = 0.2389

Be certain that you adhere to standard simulation modeling practices. Any arrival
into a model will automatically generate a successor. In the event of time ties, you
must free the server, then update the flow of transactions in the model. Initialize the
model with first arrival and service time.

Use the table on the following page to record the movement of transactions through the
simulation. Print neatly! Show all work. Use effective labels. Carry 3 significant
digits after the decimal point in your final answers.
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